SPECIFICATION
TO ALL WHOM IT MAY CONCERN: 

Be it known that Edward Perez-Reyes and Leanne L. Cribbs,
citizens of the United States of America, and resident at 320
South Birchwood Drive, Naperville, IL 60540-5033 and 1737 N.
Natoma, Chicago, IL 60707, respectively, have invented a
certain new and useful T-TYPE VOLTAGE-GATED CALCIUM CHANNELS
AND METHOD OF USING SAME of which the following is a
specification.

T-TYPE VOLTAGE-GATED CALCIUM
CHANNELS AND METHOD OF USING SAME



STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER FEDERALLY 
SPONSORED RESEARCH AND DEVELOPMENT 

This invention was made with Government support under Grant Number HL58728 
awarded by the National Heart, Lung, and Blood Institute of the National Institutes of 
Health. The United States Government may have certain rights in this invention. 

TECHNICAL FIELD OF THE INVENTION 

The present invention relates to molecular biology, and more particularly to cloned 
T-type calcium channels. 

BACKGROUND OF THE INVENTION 

Biological membranes are themselves generally impermeable to ionic species. 
Thus, ions enter cells through regulated pores formed from membrane-associated 
proteins. Most of these regulated pores are voltage-dependent and are thus able to 
transduce changes in the transmembrane potential into ion flux. Voltage-gated ion 
channels form a "superfamily" of related proteins (cf. Jan et al., Nature, 345, 672 (1990)).
Peculiar to this genus is a high degree of conservation in molecular structure. Generally,
voltage-gated channels are membrane-bound glycosylated proteins formed of many
subunits. Large α subunits form a pore in the membrane that is selective for a given ionic
species. Each α subunit contains four domains (I, II, III, and IV). Each channel domain
has six putative transmembrane helical segments (S1-S6). In general, the segments within
each domain are similar but not identical. Aside from overall structural conservation,
certain charged residues within the domains are highly conserved among voltage-gated
ion channels (Jan et al., supra; Stühmer et al., Nature, 339, 597-603 (1989)).

Differences in charged residues between groups of voltage-gated ion channels 
confer properties unique to each subgroup, such as ion selectivity. For example, most 
voltage-gated ion channels are selective for either sodium, potassium, or calcium. Known
calcium channels require a ring of negative charge provided by glutamate residues found
at similar locations in each of the domains (Yang et al., Nature, 366, 158-61 (1993)).

Voltage-gated channels are often classified on the basis of their electrophysiology.
The resting membrane potential of most animal cells is between about -70 mV and -80
mV. When the membrane becomes depolarized (moved towards 0 mV), various
membrane channels become activated (they are said to "open"). Thus, one category for
classifying membrane channels is on the basis of the membrane potential necessary to
activate (or "gate") them (voltage dependency). For example, "T-type" calcium channels
are activated at a lower voltage than L- or N-type channels (Nowycky et al., Nature, 316,
440-43 (1985)). Other physiological properties are the activation kinetics, inactivation
kinetics, tail current (deactivation kinetics), and single channel conductance. Thus, in
comparison to other calcium currents, T-type calcium current is characteristically short
(Chen et al., J. Gen. Physiol., 96, 603-30 (1990)), and it exhibits characteristically slow
activation kinetics near threshold, fast inactivation kinetics, and slow tail current (Randall
et al., Neuropharmacol., 63, 879-93 (1997); Carbone et al., Nature, 310, 501-02 (1984);
Nilius et al., Nature, 316, 443-46 (1985)).

Calcium currents have been implicated in many neurological and muscular 
functions. For example, T-type calcium current is associated with cardiac pacemaker 
activity, pain transmission in the central nervous system, and other physiological
functions. Defects in T-type calcium current have been implicated in cardiac arrhythmia, 
hypertension, and epilepsy. Given their potential clinical value, the pharmacological 
properties of calcium channels have been the subject of extensive study. Most such 
studies have involved L-type channels because, unlike T-type channels, L-type calcium 
channels are readily purified from cell extracts. For example, L-type calcium channels 
have been purified using dihydropyridine drugs (e.g., nifedipine) which can bind with 
sufficiently high affinity to serve as a ligand for purifying L-type calcium channels. Such 
purified and cloned L-type calcium channels have been used to develop assays for drugs 
affecting L-type calcium channels (see, e.g., U.S. Patents 5,429,921 and 5,386,025).

While many electrophysiological characteristics of T-type calcium currents are 
known, the lack of isolated T-type channels has stalled research into the pharmacology 
and biophysics underlying the T-type calcium current, at least in comparison with other 
calcium channels. Indeed, while it is generally assumed that voltage-sensitive ion 
channels are responsible for the current, no such channel protein, nor any nucleic acid 
encoding such a protein, has been isolated. In view of the foregoing problems, there 
exists a need for an isolated T-type calcium channel and a nucleic acid encoding a T-type 
calcium channel. 

BRIEF SUMMARY OF THE INVENTION

The present invention provides an isolated or substantially purified nucleic acid 
encoding a protein comprising at least one domain of a T-type calcium channel and cells 
expressing such nucleic acids. The present invention also provides an isolated or 
substantially purified T-type calcium channel and an isolated or substantially purified 
antibody molecule recognizing an epitope on a T-type calcium channel protein.

The present invention is useful for exploring the electrophysiology and
pharmacology of the T-type calcium current. Such knowledge can lead to the 
development of drugs for potentiating or attenuating T-type calcium channels. Thus, the 
present invention provides an assay for identifying potential drugs affecting T-type 
calcium channels by exposing cells expressing a T-type calcium channel to a putative 
drug and then measuring the calcium flux in response to a change in membrane potential. 
The identification of drugs affecting T-type calcium channels will facilitate even greater 
understanding of the biophysics of these proteins. Furthermore, some such drugs could 
have potential clinical applications. 

The invention can best be understood with reference to the accompanying 
drawings and in the following detailed description of the preferred embodiments. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figures 1A-1G show the complete nucleotide and amino acid sequences (SEQ ID
NO:1 and SEQ ID NO:2) of a T-type calcium channel (α1G or CavT.1), and the
conserved functional domains.

Figures 2A-2F show the complete nucleotide and amino acid sequences (SEQ ID
NO:3 and SEQ ID NO:4) of a T-type calcium channel (α1H or CavT.2), indicating
conserved functional domains.

Figure 3 compares the amino acid sequences of domains of the T-type calcium 
channels with those of other calcium channels. 

Figures 4A-4D are graphic representations of the current-voltage relationships of
two cloned T-type calcium channels (Figures 4A and 4B), a native T-type calcium current
in N1E-115 cells (Figure 4C), and a cloned R-type calcium channel (Figure 4D).

Figure 5A is a graphic representation of the average current-voltage curve for
cloned T-type calcium channels (α1G, closed circle; α1H, open circle), a native T-type
calcium current in N1E-115 cells (triangles), and a cloned R-type calcium channel (filled
squares). Figures 5B and 5C are graphic representations of the conductance of calcium
channels. Figure 5B compares the conductance in 2 mM BaCl2 of cloned T-type calcium
channels (α1G, closed circle; α1H, open circle), a native T-type calcium current in
N1E-115 cells (triangles), and a cloned R-type calcium channel (filled squares). Figure
5C compares the normalized conductance of a cloned T-type calcium channel at three
different concentrations of BaCl2.

Figures 6A and 6B are graphic depictions of the kinetics of a cloned T-type
calcium channel. Figure 6A compares the current recorded in cells expressing cloned
T-type (α1G) or L-type (α1E) calcium channels at -20 mV. Figure 6B compares the
voltage-dependent time constants of cloned T-type calcium channel activation and
inactivation.

Figures 7A-7F are graphic depictions of the tail current of a cloned T-type calcium
channel. Figures 7A and 7D depict tail current amplitudes for α1G and α1H,
respectively. Figures 7B and 7E depict tail current at several test potentials for α1G and
α1H, respectively. Figures 7C and 7F depict average kinetics of the tail current as a
function of repolarization potential for α1G and α1H, respectively.

Figures 8A-8C graphically illustrate the voltage dependence of the inactivation of a
cloned T-type calcium channel. Figure 8A illustrates the inactivation of cloned T-type
calcium channels due to 200 ms pre-pulses at various potentials followed by a test pulse
to -20 mV. Figure 8B compares the inactivation of cloned T-type (circles) and R-type
(squares) calcium channels due to 200 ms pre-pulses at various potentials followed by a
test pulse to -20 mV in comparison to a -100 mV control. Figure 8C depicts the voltage
dependence of inactivation induced by 10 s pre-pulses for cloned T-type (circles) and
R-type (squares) calcium channels.

Figures 9A-9C graphically illustrate the single channel conductance of a cloned
T-type calcium channel. Figure 9A depicts the raw data collected from a patch of
membrane on an oocyte expressing a cloned T-type calcium channel at various voltage
protocols. Figure 9B represents the ensemble current recorded from 100 sweeps. Figure
9C graphically illustrates the single channel amplitude plotted against test potential.

Figures 10A and 10B graphically present data concerning the use of a cloned
T-type calcium channel to detect drugs affecting the channel. Figure 10A depicts the effect
of a single 100 µM dosage of mibefradil on current-voltage relationships. Figure
10B illustrates the effect on T-type channel conductance of various doses of mibefradil.

DESCRIPTION OF THE PREFERRED EMBODIMENTS 

The present invention provides an isolated or substantially purified nucleic acid 
encoding a protein comprising at least one domain of a T-type calcium channel α subunit.
The nucleic acid can be of any type, and it can include other elements aside from a
sequence encoding a T-type calcium channel domain or domains. For example, where the
nucleic acid comprises RNA, it can also include regulatory sequences suitable to permit
translation of the RNA. Thus, an RNA nucleic acid of the present invention preferably
has at least one ribosome entry site, and preferably has a poly-adenosine tail for
stabilizing the RNA in the cellular environment.

Similarly, DNA nucleic acids of the present invention can have regulatory
elements for promoting the transcription of sequence encoding the T-type calcium
channel into an RNA such as that described above. For example, a DNA nucleic acid of
the present invention can have a promoter and/or an enhancer sequence.

The choice of promoter and/or enhancer will largely depend on the milieu in
which the nucleic acid is to be expressed. Thus, for expression in bacterial cells, the 
regulatory elements are bacterial promoters. Similarly, for expression in mammalian 
cells, the regulatory elements are able to effect expression in mammalian cells. 

While many such regulatory elements are known in the art, examples include 
prokaryotic promoters and viral promoters (e.g., retroviral ITRs, LTRs, immediate early 
viral promoters (IEp), such as herpesvirus IEp (e.g., ICP4-IEp and ICP0-IEp),
cytomegalovirus (CMV) IEp, and other viral promoters, such as Rous Sarcoma Virus 
(RSV) promoters, and Murine Leukemia Virus (MLV) promoters). Other suitable 
promoters are eukaryotic promoters, such as enhancers (e.g., the rabbit β-globin
regulatory elements), constitutively active promoters (e.g., the β-actin promoter, etc.),
signal specific promoters (e.g., inducible promoters such as a promoter responsive to 
RU486, etc.), and tissue-specific promoters (e.g., those active in epidermal tissue, dermal 
tissue, tissue of the digestive organs (e.g., cells of the esophagus, stomach, intestines, 
colon, etc., or their related glands), smooth muscles, such as vascular smooth muscles, 
cardiac muscles, skeletal muscles, lung tissue, hepatocytes, lymphocytes, endothelial 
cells, sclerocytes, kidney cells, glandular cells (e.g., those in the thymus, ovaries, testicles, 
pancreas, adrenals, pituitary, etc.), tumor cells, cells in connective tissue, cells in the 
central nervous system (e.g., neurons, neuroglia, etc.), cells in the peripheral nervous
system, and other cells of interest). While the nucleic acid can be any type of nucleic acid, 
the nucleic acid preferably comprises a cDNA. A cDNA nucleic acid is preferred over 
other nucleic acids to permit the nucleic acid to be readily cloned, sequenced, and 
expressed in a wide variety of cells. 

The isolated or substantially purified nucleic acid of the present invention encodes 
all or part of a T-type calcium channel α subunit. As used herein, a "calcium channel"
includes a protein structure for facilitating the flux of calcium ions across a biological 
membrane into which the calcium channel is inserted. As used herein, a "T-type channel" 
is a type of voltage-gated ion channel that facilitates the flux of ions when the membrane 
potential of a biological membrane into which it is inserted experiences a slight 
depolarization. Thus, a T-type calcium channel can gate at about -45 mV to about -30 
mV (i.e., about -40 mV to about -35 mV) in 2 mM Ba2+. Additionally, T-type channels of
the present invention exhibit a slow deactivation (tail current) following depolarization.
Thus, a T-type calcium channel can exhibit a tail current that decays exponentially with a
tau value from about 2 ms to about 10 ms (e.g., from about 4 ms to about 7 ms, such as
about 6 ms) following repolarization to a membrane potential from about -80 mV to
about -60 mV in a solution with a barium concentration of from about 10 mM to about 40
mM. Another defining characteristic of T-type calcium channels is that they exhibit small
single channel conductance. Thus, for example, a T-type channel exhibits a single
channel conductance of from about 7 pS to about 10 pS (e.g., from about 7.5 pS to about
9.5 pS), and typically from about 8 pS to about 9 pS in a solution with a barium ion
concentration of about 0.1 M.
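
By way of illustration only, the exponential tail-current decay recited above can be sketched numerically. The following Python sketch is not part of the original disclosure: the function names, the single-exponential model I(t) = I0 * exp(-t / tau), and the example amplitude are assumptions made for illustration, while the tau and conductance ranges are the approximate values recited above (the "about" qualifiers are not modeled).

```python
import math

# Approximate ranges recited in the text above.
TAU_RANGE_MS = (2.0, 10.0)          # tail-current decay time constant
CONDUCTANCE_RANGE_PS = (7.0, 10.0)  # single channel conductance in about 0.1 M barium


def tail_current(i0_pa, t_ms, tau_ms):
    """Single-exponential tail current: I(t) = I0 * exp(-t / tau)."""
    return i0_pa * math.exp(-t_ms / tau_ms)


def in_range(value, bounds):
    """Return True when value lies within the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= value <= high


# Hypothetical example (not measured data): a -100 pA tail current with tau = 6 ms,
# evaluated one time constant after repolarization.
remaining = tail_current(-100.0, 6.0, 6.0)
tau_is_typical = in_range(6.0, TAU_RANGE_MS)
```

Under this model, the current remaining one time constant after repolarization is exp(-1), or roughly 37%, of its initial amplitude, which is why a tau of several milliseconds corresponds to a comparatively slow tail current.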

The isolated or substantially purified nucleic acid of the present invention encodes 
all or part of any T-type calcium channel having at least one of the aforementioned 
electrophysiological properties when properly assembled within a cellular membrane. 
The general structure of calcium channels is summarized above and is otherwise known in
the art. Thus, for example, the nucleic acid can encode one of the four functional
domains mentioned above. As used herein, a domain of a T-type calcium channel is any
protein structure able to associate with three other domains to form a tetrameric body
functioning as a T-type calcium channel. While the native T-type calcium channel
structure includes all four domains in a single polypeptide (indicated in Figures 1A-1G and
2A-2F), a domain can exist as a polypeptide species separate from those containing the
other domains. Such separate domains are able to associate within the plasma membrane
to form a functional channel. Alternatively, where a plurality of domains are linked
within a common polypeptide, the linkage can deviate substantially from the native
linkage. Thus, for example, the domains can be linked by polypeptide sequences other
than those sequences (see, e.g., Figures 1A-1G and 2A-2F) linking the domains in the
native protein (e.g., non-native polyglutamate linkages). Indeed, the domains themselves
can include non-native linkages between membrane-spanning elements within the
domains. Aside from these modifications, the nucleic acid can encode a chimeric calcium
channel domain (or an entire channel) comprising a portion of a T-type calcium channel
and a portion derived from another calcium channel (or other channel) protein. For
example, the chimera can include portions of domains from T-type channels responsible
for low voltage gating and portions of domains from other calcium channels responsible
for slow inactivation. Such a protein exhibiting T-type gating but longer inactivation
kinetics would facilitate pharmacological research.

As mentioned, nucleic acids of the present invention can encode an entire T-type 
channel (i.e., a T-type channel protein comprising four functional domains). Examples of 
the amino acid sequences of two full-length T-type channels are set forth at SEQ ID 
NO:1 and SEQ ID NO:3, and examples of sequences encoding full-length T-type calcium
channels are SEQ ID NO:2 and SEQ ID NO:4. However, the invention is not limited to
these exemplary sequences. Indeed, as mentioned, an amino acid sequence of a T-type
calcium channel can vary from those listed, and it is within the state of the art to change a
nucleotide sequence encoding a T-type channel to introduce mutations into the protein.
For example, mutations comprising insertions or deletions can be introduced on either the 
amino- or carboxy-terminus of the protein, or such mutations can be intrasequence 
insertions or deletions. Where the electrophysiological properties of the calcium channel 
are to be conserved, such mutations preferably are in regions other than the membrane 
spanning domains (see, e.g., Figures 1A-1G and 2A-2F). For example, SEQ ID NO:5
and SEQ ID NO:6 are the sequences of two T-type channels having deletions in the
region linking domains III and IV. However, in some applications (e.g., to decrease
inactivation kinetics), the changes can be within the membrane-spanning regions. 
Moreover, as mentioned above, the sequence can form a protein having only one 
functional domain of a T-type calcium channel. Additionally, the sequence can also form 
a chimeric protein or domain, such as those described above. 

Aside from insertion and deletion mutations of native T-type calcium channel
sequences, a T-type calcium channel can include substitutions of amino acid residues,
e.g., for those indicated in SEQ ID NO:1 and SEQ ID NO:3. Preferably, and especially
where such a substitution is within a membrane spanning region, the substitution is 
conservative. Thus, within membrane spanning domains, positively-charged residues (H, 
K, and R) preferably are only substituted with positively-charged residues; 
negatively-charged residues (D and E) preferably are only substituted with 
negatively-charged residues; neutral polar residues (C, G, N, Q, S, T, and Y) preferably 
are only substituted with neutral polar residues; and neutral non-polar residues (A, F, I, L, 
M, P, V, and W) preferably are only substituted with neutral non-polar residues. Figure 3 
indicates the conservation between the S-IV domains of T-type calcium channel α
subunits and those of other calcium channels. Preferably, any amino-acid substitution
within the membrane-spanning regions does not alter this conservation. Most preferably,
any substitution, deletion, or insertion does not alter the IVS4 domain. In each of the
exemplary T-type calcium channel α subunit sequences (SEQ ID NO:1, SEQ ID
NO:3, SEQ ID NO:5, and SEQ ID NO:6), the putative S4 region comprises Arg Ile Met
Arg Val Leu Arg Ile Ala Arg Val Leu Lys Leu Leu Lys Met Ala Val Gly Met Arg Ala
(SEQ ID NO:7). Given the strong sequence conservation among families of 
voltage-gated ion channels, it is likely that SEQ ID NO:7, or a derivative sequence, will 
be present in T-type channels. Thus, the present invention provides any T-type calcium 
channel (or a nucleic acid encoding such a T-type calcium channel) comprising SEQ ID 
NO:7 or a sequence derived from SEQ ID NO:7 having conservative amino acid 
substitutions, as described above. 
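
The conservative-substitution rule set out above lends itself to a simple check. The Python sketch below is offered only as an illustration and is not part of the original disclosure: the function names are hypothetical, the four residue classes are taken directly from the text, and the one-letter string for SEQ ID NO:7 is a transcription of the three-letter listing given above.

```python
# Residue classes as recited in the text above (one-letter amino acid codes).
RESIDUE_CLASSES = {
    "positive": set("HKR"),
    "negative": set("DE"),
    "neutral_polar": set("CGNQSTY"),
    "neutral_nonpolar": set("AFILMPVW"),
}


def residue_class(residue):
    """Return the name of the class containing the one-letter residue code."""
    for name, members in RESIDUE_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original, replacement):
    """A substitution is conservative when both residues share a class."""
    return residue_class(original) == residue_class(replacement)


def nonconservative_positions(native, variant):
    """List 0-based positions where an equal-length variant changes residue class."""
    if len(native) != len(variant):
        raise ValueError("sequences must be the same length")
    return [
        i for i, (a, b) in enumerate(zip(native, variant))
        if not is_conservative(a, b)
    ]


# SEQ ID NO:7 transcribed to one-letter codes from the three-letter listing above.
S4_NATIVE = "RIMRVLRIARVLKLLKMAVGMRA"
```

For example, replacing the first arginine of the S4 region with lysine keeps the residue within the positively-charged class, whereas replacing it with glutamate does not; `nonconservative_positions` would flag only the latter.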

The nucleic acid of the present invention encoding all or a part of a T-type calcium 
channel can be isolated via any suitable method. For example, prior to the present
invention, one of skill in the art could design a probe based on the sequence of known,
non-T-type, calcium channels and use such a probe to screen a genetic library. If such a
screen were to identify a putative calcium channel, the researcher could then attempt to
clone the entire nucleic acid to characterize it. Similarly, prior to the present invention, to
isolate a nucleic acid encoding a T-type calcium channel, one of skill in the art could
consult publicly available databases containing DNA sequences (e.g., GenBank) to locate
nucleic or amino acid sequences representing a portion of a T-type calcium channel
protein or nucleic acid. However, such databases contain no sequence for a full-length
T-type calcium channel. Such methods assume that T-type calcium channels share
sufficient sequence identity with known calcium channel nucleic acids to cross-hybridize,
an assumption not supported by any published report. Moreover, prior to the present
invention, no partial sequence in such databases was identified as corresponding to a
T-type calcium channel. Thus, prior to the present invention, the presence of partial
sequences in the public DNA databases could facilitate the isolation of T-type calcium
channels only with the exercise of a considerable degree of speculation on the part of the
researcher.

By providing several sequences pertaining to T-type calcium channels and a 
comparison presenting conserved regions and domains, the present invention greatly 
facilitates the isolation of other nucleic acids encoding T-type calcium channels (or 
derivatives thereof) with much less experimentation. Thus, while any of the methods
discussed above can be employed to isolate other members of this genus, preferably, a
nucleic acid encoding a T-type calcium channel is isolated by probing a genetic library
using a probe that hybridizes to a DNA encoding a peptide sequence contained in (or
similar to) a known T-type calcium channel (e.g., SEQ ID NO:1 or SEQ ID NO:3). To
facilitate the isolation of a T-type calcium channel, the present invention provides an
isolated polynucleotide hybridizing to a portion of the nucleic acid of the present
invention encoding a T-type calcium channel (or a portion thereof). Thus, for example,
the present invention includes an isolated polynucleotide hybridizing to SEQ ID NO:2 or
SEQ ID NO:4. The isolated polynucleotide can hybridize to all or any portion of the
sequence encoding the T-type calcium channel.

To isolate such a polynucleotide, any portion of a sequence encoding a T-type 
calcium channel can be employed as a probe to screen a genetic library, and such 
screening can be accomplished by standard techniques known in the art. While the probe 
can hybridize to any portion of such a DNA, preferably the probe is designed to hybridize 
to a DNA encoding a polypeptide sequence that is highly conserved among T-type
calcium channels but is less conserved between the genus of T-type calcium channels and
other proteins. Such peptide sequences are readily apparent from the sequence
comparison set forth in Figure 3. Generally, the specificity of hybridization in a genetic
screen varies depending on the length of the probe and the stringency (e.g., temperature, 
salt and detergent concentration, etc.) of hybridization. Stringency of hybridization is 
broadly classified as "high," "moderate," or "low," and the parameters of these terms are
well recognized in the art (see, e.g., Sambrook et al., "Molecular Cloning, a Laboratory 
Manual," Cold Spring Harbor Press, 1989). The isolated polynucleotide hybridizing to a 
portion of the nucleic acid encoding a T-type calcium channel can hybridize under any 
desired stringency conditions. However, for identifying other T-type channels, 
preferably, the hybridization occurs under moderate stringency, and most preferably 
under high stringency. 

Of course, the isolated or substantially purified polynucleotide can itself be 
employed as a probe to screen a library as described to isolate a second nucleic acid. In 
such a screen, one of the polynucleotides will be complementary to a portion of the 
sequence encoding the T-type calcium channel, and the other isolated nucleic acid will be 
"sense." Preferably, one of the two isolated polynucleotides (the "sense" strand) itself 
encodes a T-type calcium channel, or at least one domain thereof. Such a sequence can 
be cloned to be operably linked to suitable regulatory elements, as described, to produce a 
T-type calcium channel. Thus, aside from using the nucleic acid of the present invention 
to produce a T-type calcium channel, the nucleic acids of the present invention are also 
useful for isolating other sequences encoding T-type calcium channels, or derivatives 
thereof. 

However isolated, the isolated or substantially purified nucleic acid of the present 
invention is useful, in part, for producing all or a portion of a T-type calcium channel. 
Thus, the nucleic acid can be introduced into a suitable milieu for driving its expression. 
Because T-type channels are transmembrane proteins, preferably such a milieu is a living 
cell. However, it should be understood that the nucleic acid can also be expressed in vitro 
under conditions, such as those known in the art, suitable for in vitro transcription and 
translation. However produced, the present invention includes any protein, such as a 
recombinant protein or an isolated or substantially purified protein, including all or a 
portion of a T-type calcium channel or a protein derived from a T-type calcium channel. 
Such proteins are described above. 

For expression in a living cell, the nucleic acid must be introduced into the cell.
As nucleic acids are generally introduced into cells as part of genetic vectors, the present 
invention provides a vector having a T-type calcium channel nucleic acid of the type 
described above. Any type of vector suitable for introducing the nucleic acid into a host 
cell is within the context of the present invention. Examples of such vectors include 
naked DNA and RNA vectors (such as oligonucleotides, plasmids, capped cRNA, etc.),
viral vectors such as adeno-associated viral vectors (Berns et al., Annals of the New York
Academy of Sciences, 772, 95-104 (1995)), adenoviral vectors (Bain et al., Gene Therapy,
1, S68 (1994)), herpesvirus vectors (Fink et al., Ann. Rev. Neurosci., 19, 265-87 (1996)),
packaged amplicons (Federoff et al., Proc. Natl. Acad. Sci. USA, 89, 1636-40 (1992)),
papilloma virus vectors, picornavirus vectors, polyoma virus vectors, retroviral vectors,
SV40 viral vectors, vaccinia virus vectors, and other vectors. Once a given type of vector 
is selected, its genome must be manipulated for use as a background vector, after which it 
must be engineered to incorporate exogenous polynucleotides. Such manipulations are 
known in the art. 

The vectors of the present invention are useful for introducing a nucleic acid
encoding all or a portion of a T-type calcium channel into a host cell. Thus, the present
invention provides a cell into which the vector of the present invention has been
introduced. The host cell can be any cell suitable for expressing the nucleic acid (e.g.,
bacteria, insect cells, mammalian cells, etc.). The host cell can thus be in vitro or in vivo.
Preferably the cells do not exhibit native T-type calcium current. A preferred cell type is
HEK-293 cells because they contain genetic elements that facilitate the expression of 
transgenes from a variety of expression vectors. For facilitating electrophysiological 
recordings, oocytes (e.g., Xenopus oocytes) are preferred, as they are large and readily 
handled. 

The vector can be introduced into the cell in any manner suitable for the cell type
and vector employed. In one embodiment, the vector can be used to prepare an RNA
transcript in vitro (e.g., a capped cRNA) which is then introduced into the host cell by
standard methods (such as injection). Such techniques are preferred when the host cells
do not actively transcribe DNA (such as oocytes). In other embodiments, a DNA vector
is introduced into the cell such that it is transcribed within the cell. For example, the
vector can be introduced into the cell such that it forms an extrachromosomal segment of
genetic material in the cell, as is the case with many types of viral vectors. Alternatively,
the vector can introduce the nucleic acid into the chromosomal DNA of the host cell. In
this respect, a cell line comprising chromosomes into which the T-type calcium channel
nucleic acid has been introduced is able to propagate the nucleic acid through several
passages (e.g., for at least 10 passages), and, preferably, the nucleic acid is stably
integrated into the chromosomes of such cells. Thus, the cell line can propagate the
nucleic acid for at least 20 passages, and more preferably significantly more than 20
passages (e.g., at least about 25 passages, or even more).

Preferably, a cell into which the nucleic acid is introduced is also able to express
the nucleic acid. The expression of the nucleic acid can be detected by probing the cell
for the presence of T-type calcium channel mRNA, such as via Northern hybridization
analysis, in situ hybridization, etc. More preferably, however, the cell is able to express
the nucleic acid to produce the protein including all or a portion of a T-type calcium 
channel. In such cells, expression of the nucleic acid is confirmed by detecting the 
protein, for example, by probing cellular extracts with an antibody recognizing the protein 
(e.g., on a Western blot, etc.). Of course, the protein contributes to the formation of a 
functional calcium channel in the membrane of the cell producing the protein. Where the 
protein comprises an entire α subunit, the full protein will possess some or all of the
electrophysiological properties of T-type calcium channels described above. Where the
protein comprises less than an entire channel α subunit (e.g., a domain), the protein will
aggregate with other constituent domains in the membrane to form a functional channel. 
Thus, the presence of the protein can be detected by assaying the cell for T-type calcium 
channel activity. Indeed, assaying for channel activity serves to determine whether a 
nucleic acid encoding a putative calcium channel, in fact, encodes a species of T-type 
channel (as opposed to a member of another genus of calcium channels). For example, 
when large cells (e.g., oocytes) are used as the host cells, the electrophysiological 
properties of the channel can be investigated. Thus, the membrane activity of whole cells 
expressing the nucleic acid can be measured directly, such as via patch clamp techniques 
using a voltage clamp electrode and a current electrode (Bernal et al., J. Pharmacol. Exp.
Ther., 282, 172-80 (1997)). Alternatively, the activity of single channels can be
measured, such as with a standard depolarizing bath and pipette solutions (Lacerda et al.,
Biophys. J., 66, 1833-43 (1994)). However measured, the properties of cells into which the
putative nucleic acid is introduced are compared to the channel conductance,
voltage dependency, activation kinetics, inactivation kinetics, and tail current known for
T-type channels and discussed above.

Regardless of the cell system, the ability to express a T-type calcium channel 
nucleic acid within host cells to produce an active channel permits the channel to be 
further studied. In this regard, the present invention provides a method of identifying a 
drug which affects T-type calcium channels. The method involves first expressing a 
T-type calcium channel in a cell to produce an active channel, as herein described. The 
cell expressing the channel is then exposed to a solution containing a putative drug for 
interfering with the channel. Thereafter, the presence or absence of calcium flux in 
response to a change in membrane potential is assayed. A quick method of assaying for 
calcium flux is first to introduce a calcium-sensitive labile dye into the cells. For 
example, the dye can be one such as those that fluoresce or change color in the presence 
of calcium (e.g., Indo-1). Thereafter, the cells are exposed to a depolarizing solution 
containing high (e.g., about 50 mM) potassium concentration and a drug, and the reaction 
of the labile dye is compared to control cells. Using a labile dye affords the ability to
assay many putative drugs quickly in a high throughput assay for putative drugs affecting
T-type channels. For example, the initial screening can be carried out in 96 well plates. 
Moreover, dose-response data can be readily generated by exposing the cells to several 
concentrations of the same putative drug. 
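
The dye readout described above can be reduced to a percent-inhibition figure for each well. The following is a minimal Python sketch, assuming that each well yields a single fluorescence value and that depolarized drug-free wells and non-depolarized wells serve as the two reference points. The function names, the 50% hit threshold, and the plate readings are hypothetical and are not taken from this specification.

```python
def percent_inhibition(drug_signal, depolarized_control, resting_baseline):
    """Percent reduction of the depolarization-evoked dye signal in a drug well."""
    window = depolarized_control - resting_baseline
    if window <= 0:
        raise ValueError("depolarized control must exceed the resting baseline")
    return 100.0 * (1.0 - (drug_signal - resting_baseline) / window)


def flag_hits(well_signals, depolarized_control, resting_baseline, threshold=50.0):
    """Return the well labels whose inhibition meets an assumed threshold."""
    return [
        well
        for well, signal in sorted(well_signals.items())
        if percent_inhibition(signal, depolarized_control, resting_baseline) >= threshold
    ]


# Hypothetical readings (arbitrary fluorescence units) from three wells.
readings = {"A1": 950.0, "A2": 600.0, "A3": 260.0}
hits = flag_hits(readings, depolarized_control=1000.0, resting_baseline=200.0)
```

Repeating the same calculation across several concentrations of one compound would give the raw points for a dose-response curve.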

Once a putative drug is detected, its effect on the electrophysiology of the cell 
(e.g., single channel conductance, voltage dependency, activation kinetics, inactivation 
kinetics, and tail current of the cells) can be investigated in detail. Generally, the effect of 
the putative drug on T-type calcium currents is assessed by measuring the various 
electrophysiological parameters in the presence of various concentrations of the drugs and 
comparing the data to untreated (or sham-treated) control cells. Cells preferably are
maintained in a continuous perfusion chamber during such experiments to facilitate 
changing solutions. The inventive method of identifying a drug which affects T-type 
calcium channels can employ any nucleic acid encoding a T-type calcium channel (or 
derivative thereof), such as those nucleic acids described herein. In fact, as several 
isoforms of T-type channel exist (e.g., α1G and α1H), the assay method can be repeated
using nucleic acids encoding different isoforms to identify drugs that preferentially target 
a given isoform, or drugs which affect more than one isoform of T-type calcium channels. 

Aside from affording an in vitro assay for detecting potential therapeutic or 
investigative drugs targeting T-type calcium channels, the method of expressing the 
T-type calcium channel nucleic acid can also be used in vivo. For example, as mentioned,
mutations affecting native nucleic acids encoding T-type calcium channels have been
implicated in several neurological and muscular diseases or disorders. The present invention,
thus, provides a method of treating a disease or disorder associated with a deficiency in a
native T-type calcium channel nucleic acid. The method involves introducing a vector
having the T-type calcium channel nucleic acid into cells of a host in which native
expression of the nucleic acid is deficient. Thus, for example, for treating
cardiomyopathy associated with deficiencies in T-type calcium channels, the vector is
introduced into myocardial cells. Similarly, for treating forms of epilepsy associated with
deficiencies in T-type calcium channels, the vector is introduced into neurons (e.g.,
thalamic neurons). Within the target cells, the nucleic acid within the vector is expressed
to produce an active T-type calcium channel. By similar methods, a nucleic acid having a
sequence antisense to a sequence encoding a T-type calcium channel (or a portion thereof)
can be expressed within a cell. The presence of an antisense sequence can down-regulate
the expression of native T-type calcium channel genes by hybridizing to T-type channel
mRNA within the cell. Thus, the present invention is useful for treating disorders
associated with over-expression of T-type calcium channels.

T-type channel proteins (such as whole T-type calcium channels, domains of such
channels, chimeras including portions of T-type calcium channels, etc.) can be employed 
to generate antibodies (e.g., immunoglobulins) to T-type calcium channels. Thus, the 
present invention provides an isolated and substantially purified antibody molecule 
recognizing an epitope on a T-type calcium channel. Such antibodies can be monoclonal 
antibodies or polyclonal antisera. Such antibodies can be produced by any suitable 
method, many of which are well known in the art. Antibodies recognizing T-type calcium 
channels can be used to purify the channels from cell extracts or other solutions by 
standard methodologies (e.g., immunoprecipitation). Moreover, depending on the 
location of the epitopes for the antibodies on the T-type calcium channel, the antibodies 
can be used to affect the channel proteins present on the surface of cells. Thus, antibodies 
directed to T-type calcium channels are potential reagents for studying the channels as 
well as for therapy. 

EXAMPLES 

Several examples are presented below to illustrate the invention. Taken together, 
the examples demonstrate the cloning of two novel proteins and their characterization as 
T-type calcium channel α subunits. These examples are included here for purely
illustrative purposes; as such, they are not to be construed so as to limit the scope of any 
aspect of the invention. 

Many procedures employed in the following examples are techniques routinely 
performed by one of ordinary skill in the art (see generally Sambrook et al., Molecular 
Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 
(1989)) and are not discussed in detail. However, some reagents and methods deserve 
specific description. Thus, for example, in vitro translation and expression were 
conducted as described previously (Schneider et al., Receptors and Channels, 2, 255-70 
(1995)). Xenopus laevis oocytes were prepared as described previously (Bernal et al., J. 
Pharmacol. Exp. Ther., 282, 172-80 (1997)). To express proteins, 10 or 30 ng of capped
cRNA was injected into the oocytes or N1E-115 cells in a volume of 50 nL. For single
channel recording, oocytes were injected with 100 ng capped cRNA and incubated for 
one week prior to assay. 

Cells were voltage clamped using a two-microelectrode voltage clamp amplifier as
described (Bernal et al., J. Pharmacol. Exp. Ther., 282, 172-80 (1997)). The standard
bath solution contained the following: 40 mM Ba(OH)₂, 50 mM NaOH, 1 mM KOH, 0.1
mM EDTA, and 5 mM HEPES, adjusted to pH 7.4 with methanesulfonate. The
osmolality of the 2 and 10 mM Ba²⁺ solutions was balanced by increasing the NaOH
concentration as described (Lory et al., J. Physiol. (London), 429, 95-112 (1990)).
Voltage and current electrodes (1.5-1.8 MΩ tip resistance) were filled with 3 M KCl.
Except as noted, data were acquired at 4 kHz using the pCLAMP system, and filtered at 1
kHz. Data were analyzed using pCLAMP software. Boltzmann fits and linear regression
were calculated using Prism.

EXAMPLE 1 

This example demonstrates the cloning and characterization of two putative T-type 
calcium channels. 

A search of the GenBank database was conducted to identify clones
having some degree of homology to calcium channels. The search identified an expressed
sequence tag (EST) partial sequence in a human brain clone (H06096), which was
used as a probe to screen a λgt10 cDNA library prepared from rat brain. Successive
screening of the cDNA library identified five overlapping clones which were aligned to
construct an entire cDNA sequence, termed α1G (representing nucleotides 379-7540 of
SEQ ID NO:2).

The α1G cDNA was cloned into the pSP72™ vector and sequenced by standard
computer-assisted sequencing. Using the α1G cDNA, the amino acid sequence of the
α1G protein was deduced (SEQ ID NO:1) and compared to the sequences of other known
calcium channel α subunits. Figure 1 sets forth these sequences and subunits, and it
indicates the putative transmembrane domains of the protein. By similar methods,
homologous human (H19230 and R19524) and mouse (AA286626) EST clones were also
identified and partially sequenced.

A second T-type calcium channel, termed α1H, was isolated by screening a human
heart cDNA library with a fragment of the α1G sequence. The cDNA sequence of α1H is
set forth at SEQ ID NO:4, and the deduced amino acid sequence is set forth at SEQ ID
NO:3. Also, Figure 2 sets forth these sequences and indicates the subunits and putative
transmembrane domains of the protein.

The α1G and α1H clones were compared to each other and a known calcium
channel (α1E) to investigate the conservation of protein structure and function. The
comparison indicates that the α1G and α1H amino acid sequences within the putative
membrane-spanning domains are 91% identical to each other, while the α1G and α1H
sequences are only 39% identical to the α1E clone. Within the critical IVS4 region, the
α1G and α1H proteins are 100% identical, while each is only 44% identical to the α1E
clone.
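
Identity figures of this kind are typically obtained by counting matching positions in an alignment. The following Python sketch shows one such calculation, assuming the two sequences have already been aligned to equal length with "-" marking gaps and that gapped positions are excluded from the denominator. The specification does not state how its percentages were computed, and the function name and toy peptide strings are illustrative only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of non-gap alignment columns at which the two residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy one-letter peptide strings, not taken from the sequence listing.
identity = percent_identity("MTEGARAA", "MTEGSRAA")
```

Restricting the input strings to a region of interest (for example, a single transmembrane segment) would give a region-specific figure.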

Figure 3 indicates this conservation between the proteins. The conservation of
charged residues, particularly in the S4 domains, is consistent with the role of the α1G
and α1H proteins as ion channels. However, two of the glutamates associated with ion
specificity in other calcium channels have been replaced with aspartate, suggesting altered
ion selectivity. Strikingly, both α1G and α1H display only low homology in the sequences
linking the membrane-spanning regions within each domain, and even less homology
in the intracellular loops linking domains. Notably, neither α1G nor α1H possesses
sequences known to bind β subunits or Ca²⁺ ions.

EXAMPLE 2

This example demonstrates that the two cloned putative T-type calcium channels
exhibit T-type current-voltage relationships.

The α1G and α1H proteins were produced in Xenopus laevis oocytes by
linearizing the DNA vectors containing the coding sequences, and transcribing the coding
sequences in vitro by standard methods. Oocytes were then injected with the capped
RNA as described.

Current traces were elicited by depolarizing voltage clamp pulses of the
membranes of cells. Figures 4A-5E depict data obtained from these experiments using
cells injected with α1G and α1H (Figure 4A and 4B, respectively) and α1E (Figure 4C),
as well as undifferentiated N1E-115 cells (Figure 4D), which exhibit classic T-type
calcium current (Shuba et al., J. Physiol. (London), 443, 25-44 (1991)). These data
indicate that cells expressing α1G and α1H (Figures 4A and 4B) exhibit T-type calcium current,
while oocytes expressing α1E (Figure 4C) as well as uninjected oocytes (Figure 6A) do
not.

Current-voltage curves were developed using cells injected with α1G, α1H, and
α1E, as well as undifferentiated N1E-115 cells. Figure 5A depicts such data generated in
a 10 mM Ba²⁺ test solution. These data were transformed into conductance (Figure 5B)
and fit with a Boltzmann equation to determine the midpoint of activation (V0.5). Both
N1E-115 cells and α1G currents exhibited low gating potentials (-41 ± 1 mV, n=10,
and -38 ± 1 mV, n=8, respectively), while α1E required significantly more positive
potentials to open (-2.6 ± 0.4 mV, n=3).
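
The conductance transformation and Boltzmann fit described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Prism procedure used in the examples: the chord-conductance form G = I / (V - E_rev), the Boltzmann form 1 / (1 + exp((V0.5 - V) / k)), the brute-force grid search, and the synthetic data points are all assumptions introduced here.

```python
import math


def chord_conductance(current, voltage_mV, reversal_mV):
    """Convert a peak current to conductance, G = I / (V - E_rev)."""
    return current / (voltage_mV - reversal_mV)


def boltzmann(voltage_mV, v_half_mV, k_mV):
    """Normalized conductance at a given voltage for midpoint V0.5 and slope factor k."""
    return 1.0 / (1.0 + math.exp((v_half_mV - voltage_mV) / k_mV))


def fit_boltzmann(voltages_mV, g_normalized, v_half_grid, k_grid):
    """Return the (V0.5, k) grid point with the smallest squared error."""
    best = None
    for v_half in v_half_grid:
        for k in k_grid:
            error = sum(
                (boltzmann(v, v_half, k) - g) ** 2
                for v, g in zip(voltages_mV, g_normalized)
            )
            if best is None or error < best[0]:
                best = (error, v_half, k)
    return best[1], best[2]


# Synthetic, noise-free data generated from assumed parameters (-38 mV, k = 6 mV).
voltages = [float(v) for v in range(-80, 11, 10)]
synthetic = [boltzmann(v, -38.0, 6.0) for v in voltages]
v_half_grid = [x * 0.5 for x in range(-160, 1)]  # -80.0 to 0.0 mV in 0.5 mV steps
k_grid = [x * 0.5 for x in range(2, 31)]         # 1.0 to 15.0 mV in 0.5 mV steps
v_half, k = fit_boltzmann(voltages, synthetic, v_half_grid, k_grid)
```

A grid search is used only to keep the sketch dependency-free; a nonlinear least-squares routine would normally be preferred for real recordings.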

To compare the characteristics with published values (Huguenard, Ann. Rev.
Physiol., 58, 329-48 (1996)), the α1G current was recorded at varying concentrations of
Ba²⁺. As indicated in Figure 5C, in solutions containing 2 mM Ba²⁺, V0.5 was -46.5 mV,
and the slope factor (k) was 6.6 (n=7). However, when the Ba²⁺ concentration was 40
mM, V0.5 was recorded at -21 mV, presumably due to the effect of barium on surface
charge screening (see, e.g., Wilson et al., J. Membrane Biol., 72, 117-30 (1983)). Similar
values were recorded for α1H.

These results indicate that α1G and α1H are low-voltage activated calcium
channels.

EXAMPLE 3

This example demonstrates that the cloned putative T-type calcium channels
exhibit T-type kinetics.

To measure activation and inactivation kinetics, oocytes injected with α1E or α1G
were pulsed to -20 mV in 40 mM Ba²⁺. Data representing the average of five
sweeps recorded at 2 kHz and filtered at 1 kHz are presented in Figure 6A. The time
constants for α1G inactivation and activation were determined by fitting the data with
exponentials. These data are depicted in Figure 6B. These values correspond with the
kinetics of the T-type calcium current.
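
One simple way to extract a time constant from a decaying current is a log-linear regression. The Python sketch below assumes a single exponential, I(t) = I0 * exp(-t / tau), with zero steady-state current and no noise; the specification does not state the fitting algorithm used, and the function name and synthetic trace are hypothetical.

```python
import math


def fit_time_constant(times_ms, currents):
    """Estimate tau (ms) for I(t) = I0 * exp(-t / tau) by regressing ln|I| on t."""
    logs = [math.log(abs(i)) for i in currents]  # requires nonzero currents
    n = len(times_ms)
    mean_t = sum(times_ms) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_ms)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_ms, logs))
    slope = sxy / sxx  # equals -1 / tau for a pure exponential decay
    return -1.0 / slope


# Synthetic inward current decaying with an assumed 20 ms time constant.
times = [0.0, 5.0, 10.0, 20.0, 40.0]
trace = [-100.0 * math.exp(-t / 20.0) for t in times]
tau = fit_time_constant(times, trace)
```

For recorded traces with a nonzero baseline or noise near zero, a direct nonlinear fit would be more robust than the logarithmic transform used here.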

EXAMPLE 4

This example demonstrates that the cloned putative T-type calcium channels
exhibit T-type tail deactivation current.

Tail current was measured by prepulsing the cells expressing α1G (oocytes) and
α1H (HEK 293 cells) at -90 mV followed by periodic pulses at -10 mV or a pulse at -50
mV. The recordings in Figures 7A and 7B indicate that the current elicited at -50 mV
follows the current measured at -10 mV. These data confirm that the decline in current is
due to inactivation, rather than activation of a contaminating outward current.

The voltage-dependence of tail current was measured at varying test potentials.
Data representing such studies are presented in Figures 7C and 7D, respectively. The data
were fit with a single exponential and plotted as a function of depolarization potential
(Figures 7E and 7F, respectively). These results demonstrate that the tail currents for the
two cloned calcium channels, α1G and α1H, are voltage-dependent, consistent with
known T-type calcium tail currents.

EXAMPLE 5

This example demonstrates that the cloned putative T-type calcium channels
exhibit T-type voltage dependent inactivation.

To measure inactivation, oocytes expressing α1G or α1E were subjected to 200 ms
pre-pulses at various potentials followed by a test pulse to -20 mV. The results of these
assays are depicted in Figure 8A.

The data for the 200 ms prepulse experiments were averaged and plotted as a
function of prepulse potential (Figure 8B, n=2 or 4), with a control defined as the current
measured after a prepulse of -100 mV.

To approximate steady state conditions, similar experiments were conducted using
10 s prepulses. Inactivation of α1G occurred at sub-threshold potentials and displayed a
steep voltage dependence (V0.5 = -50.0 ± 0.2 mV, k = -3.2 ± 0.2, n=5), while inactivation
of cloned α1E occurred at more positive potentials and displayed weaker voltage dependence (V0.5 =
-30.0 ± 0.4 mV, k = -9.4 ± 0.3, n=6). These data are depicted in Figure 8C.
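
The midpoints and slope factors reported above can be turned into predicted availability curves. The Python sketch below assumes the Boltzmann form 1 / (1 + exp((V0.5 - V) / k)), in which a negative k yields a curve that falls with depolarization, and assumes k is expressed in mV; neither the equation form nor the unit of k is stated in this specification.

```python
import math


def available_fraction(prepulse_mV, v_half_mV, k_mV):
    """Fraction of channels still available after a prepulse (assumed Boltzmann form)."""
    return 1.0 / (1.0 + math.exp((v_half_mV - prepulse_mV) / k_mV))


# Mean (V0.5, k) values reported in Example 5 for the 10 s prepulse protocol.
ALPHA1G = (-50.0, -3.2)
ALPHA1E = (-30.0, -9.4)

for prepulse in (-70.0, -50.0, -30.0):
    g = available_fraction(prepulse, *ALPHA1G)
    e = available_fraction(prepulse, *ALPHA1E)
    print(f"{prepulse:6.1f} mV  alpha1G {g:.3f}  alpha1E {e:.3f}")
```

Under these assumptions, the smaller magnitude of k for α1G produces a much steeper transition around its midpoint than the α1E curve shows around its own.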

EXAMPLE 6 

This example demonstrates that the cloned putative T-type calcium channels 
exhibit T-type single channel conductance. 

Measurement of single channel conductance is hampered by the low probability of
channel opening at negative potentials when the driving force is large. Thus, single
channel conductance was measured similarly to measurements of tail currents to enhance
channel opening at negative potentials. Single channels were measured with standard
depolarizing bath and pipette (115 mM BaCl₂, 1 mM EGTA, and 10 mM HEPES, pH
7.4) solutions (Lacerda et al., Biophys. J., 66, 1833-43 (1994)). Data were analyzed with
TRANSIT (VanDongen, Biophys. J., 70, 1303-15 (1996)). Single channel amplitudes
were measured by averaging the values obtained from Gaussian fits to all-points
histograms of traces with openings, selected openings, or amplitude histograms of
idealized openings. It has been reported that some oocytes contain a native 9 pS channel.
These endogenous channels can be distinguished by their 2-fold larger current amplitudes
at the potentials tested (e.g., at -20 mV, i = 0.8 pA for endogenous channels as opposed to 0.4
pA for α1G). However, such endogenous channels were not detected either at the whole
cell or single channel level in the oocytes tested.

Data were recorded from a patch in oocytes expressing large (>500 nA) α1G
currents using a 5 ms step to -20 mV followed by repolarization at potentials indicated in 
Figure 9A. Data were acquired at 10 kHz and filtered at 2 kHz online and again at 1 kHz 
off-line. The numbers on the right in Figure 9A indicate the numbers of channels open at 
any given time. 

Current through the main open state of each open channel was measured at each
potential and plotted against each test potential. These data are depicted in Figure 9C.
Single channel conductances for seven patches were averaged. The average slope
conductance of the α1G channel was measured at 7.5 ± 1.5 pS, which corresponds with
the reported values for T-type calcium channels (Huguenard, Ann. Rev. Physiol., 58,
329-48 (1996)).
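
A slope conductance of this kind is the gradient of the single-channel current-voltage relation. The Python sketch below computes it by ordinary least squares, assuming amplitudes in pA and potentials in mV (so that pA/mV equals nS); the function name, the test potentials, and the +40 mV reversal potential used to generate the synthetic amplitudes are hypothetical.

```python
def slope_conductance_pS(voltages_mV, currents_pA):
    """Least-squares slope of the single-channel i-V relation, returned in pS."""
    n = len(voltages_mV)
    if n < 2:
        raise ValueError("at least two points are needed to estimate a slope")
    mean_v = sum(voltages_mV) / n
    mean_i = sum(currents_pA) / n
    sxx = sum((v - mean_v) ** 2 for v in voltages_mV)
    sxy = sum(
        (v - mean_v) * (i - mean_i) for v, i in zip(voltages_mV, currents_pA)
    )
    return 1000.0 * sxy / sxx  # pA/mV is nS; multiply by 1000 for pS


# Synthetic amplitudes for an assumed 7.5 pS channel reversing at +40 mV.
test_potentials = [-80.0, -60.0, -40.0, -20.0]
amplitudes = [0.0075 * (v - 40.0) for v in test_potentials]
conductance = slope_conductance_pS(test_potentials, amplitudes)
```

Averaging the per-patch slopes, rather than pooling all points into one regression, matches the per-patch averaging described in the text.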

An ensemble current from 100 sweeps at a -40 mV test potential was prepared from
the idealized data and fit with a single exponential (τ = 8 ms). This ensemble current is
depicted in Figure 9B. This ensemble current exhibits decay kinetics similar to that
observed in the macroscopic current measured above (see Figure 7A).

These results indicate that the cloned α1G protein exhibits T-type single-channel
conductance.

EXAMPLE 7 

This example demonstrates that a cloned T-type calcium channel can be used for 
identifying a drug which affects T-type calcium channels. 

Cells were subjected to treatment as indicated above in Example 2, except that an
experimental group of cells was exposed to a solution containing 100 µM mibefradil, a
known inhibitor of T-type calcium current. As depicted in Figure 10A, the presence of
mibefradil almost completely abolished T-type current in cells expressing α1G. Cells
were similarly treated using various concentrations of mibefradil to determine a
dose-response relationship. These results, depicted in Figure 10B, demonstrate that 50%
inhibition was achieved at a mibefradil concentration of 23 µM.
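
A dose-response relationship of this kind is commonly summarized with a Hill equation. The Python sketch below assumes a Hill coefficient of 1, which is not reported in this specification, and also shows a rough way to estimate the half-maximal concentration from discrete data by interpolating on a logarithmic concentration axis. The function names and the three test concentrations are hypothetical.

```python
import math


def hill_inhibition(conc_uM, ic50_uM, hill=1.0):
    """Fractional inhibition predicted by a Hill equation (coefficient assumed)."""
    return conc_uM ** hill / (conc_uM ** hill + ic50_uM ** hill)


def interpolate_ic50(concs_uM, inhibitions):
    """Estimate the 50% point by log-linear interpolation between bracketing doses."""
    points = list(zip(concs_uM, inhibitions))
    for (c0, y0), (c1, y1) in zip(points, points[1:]):
        if y0 <= 0.5 <= y1 and y1 > y0:
            fraction = (0.5 - y0) / (y1 - y0)
            log_c = math.log10(c0) + fraction * (math.log10(c1) - math.log10(c0))
            return 10.0 ** log_c
    raise ValueError("data do not bracket 50% inhibition")


# Synthetic inhibition values generated from the reported 23 µM half-maximal dose.
doses = [1.0, 10.0, 100.0]
responses = [hill_inhibition(c, 23.0) for c in doses]
estimate = interpolate_ic50(doses, responses)
```

With only three widely spaced doses the interpolated estimate is approximate; a finer concentration series or a full curve fit would narrow it.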

All of the references cited herein, including patents, patent applications, and 
publications, are hereby incorporated in their entireties by reference. 

While this invention has been described with an emphasis upon preferred 
embodiments, it will be obvious to those of ordinary skill in the art that variations of the 
preferred embodiments may be used and that it is intended that the invention may be 
practiced otherwise than as specifically described herein. Accordingly, this invention 
includes all modifications encompassed within the spirit and scope of the invention as 
defined by the following claims.

(1) GENERAL INFORMATION:

(i) APPLICANT:

(A) NAME: Edward Perez-Reyes

(B) STREET: 2160 S. First Avenue, Building 102, Room 4669

(C) CITY: Maywood

(D) STATE: IL

(E) COUNTRY: US

(F) POSTAL CODE (ZIP): 60153

(G) TELEPHONE: (708) 216-6305

(H) TELEFAX:

(I) TELEX:

(ii) TITLE OF INVENTION: T-Type Voltage-Gated Calcium Channels and Method of Using Same

(iii) NUMBER OF SEQUENCES: 5

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6096 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

ATG ACC GAG GGC GCA CGG GCC GCC GAC GAG GTC CGG GTG CCC CTG GGG
Met Thr Glu Gly Ala Arg Ala Ala Asp Glu Val Arg Val Pro Leu Gly
1 5 10 15

CGC CGC CCC TGG CCC TGC GGC GTT GGT GGG GGC GTC CCC GGA GAG CCC
Arg Arg Pro Trp Pro Cys Gly Val Gly Gly Gly Val Pro Gly Glu Pro
20 25 30

CGG GGC GCC GGG ACG CGA GGC GGA GGG GGG TTC GAG CTC GGC GTG TCA
Arg Gly Ala Gly Thr Arg Gly Gly Gly Gly Phe Glu Leu Gly Val Ser
35 40 45

CCC TCC GAG AGC CCG GCG GCC GAG CGC TGC GCG GAG CTG GGT GCC GAC
Pro Ser Glu Ser Pro Ala Ala Glu Arg Cys Ala Glu Leu Gly Ala Asp
50 55 60

GAG GAG CAG CGC GTC CCG TAC CCG GCC TTG GCG GCC ACG GTC TTC TTC
Glu Glu Gln Arg Val Pro Tyr Pro Ala Leu Ala Ala Thr Val Phe Phe
65 70 75 80

TGC CTC GGT CAG ACC ACG CGG CCG CGC AGC TGG TCC GTC CGG CTG GTC
Cys Leu Gly Gln Thr Thr Arg Pro Arg Ser Trp Ser Val Arg Leu Val
85 90 95

TGC AAC CCA TGG TTC GAG CAC GTG AGC ATG CTG GTA ATC ATG CTC AAC
Cys Asn Pro Trp Phe Glu His Val Ser Met Leu Val Ile Met Leu Asn
100 105 110

TGC GTG ACC CTG GGC ATG TTC CGG CCC TGT GAG GAC GTT GAG TGC GGC
Cys Val Thr Leu Gly Met Phe Arg Pro Cys Glu Asp Val Glu Cys Gly
115 120 125

TCC GAG CGC TGC AAC ATC CTG GAG GCC TTT GAC GCC TTC ATT TTC GCC
Ser Glu Arg Cys Asn Ile Leu Glu Ala Phe Asp Ala Phe Ile Phe Ala
130 135 140

TTT TTT GCG GTG GAG ATG GTC ATC AAG ATG GTG GCC TTG GGG CTG TTC
Phe Phe Ala Val Glu Met Val Ile Lys Met Val Ala Leu Gly Leu Phe
145 150 155 160

GGG CAG AAG TGT TAC CTG GGT GAC ACG TGG AAC AGG CTG GAT TTC TTC
Gly Gln Lys Cys Tyr Leu Gly Asp Thr Trp Asn Arg Leu Asp Phe Phe
165 170 175

ATC GTC GTG GCG GGC ATG ATG GAG TAC TCG TTG GAC GGA CAC AAC GTG
Ile Val Val Ala Gly Met Met Glu Tyr Ser Leu Asp Gly His Asn Val
180 185 190

AGC CTC TCG GCT ATC AGG ACC GTG CGG GTG CTG CGG CCC CTC CGC GCC
Ser Leu Ser Ala Ile Arg Thr Val Arg Val Leu Arg Pro Leu Arg Ala
195 200 205

ATC AAC CGC GTG CCT AGC ATG CGG ATC CTG GTC ACT CTG CTG CTG GAT
Ile Asn Arg Val Pro Ser Met Arg Ile Leu Val Thr Leu Leu Leu Asp
210 215 220

ACG CTG CCC ATG CTC GGG AAC GTC CTT CTG CTG TGC TTC TTC GTC TTC
Thr Leu Pro Met Leu Gly Asn Val Leu Leu Leu Cys Phe Phe Val Phe
225 230 235 240

TTC ATT TTC GGC ATC GTT GGC GTC CAG CTC TGG GCT GGC CTC CTG CGG
Phe Ile Phe Gly Ile Val Gly Val Gln Leu Trp Ala Gly Leu Leu Arg
245 250 255

AAC CGC TGC TTC CTG GAC ACT GCC TTT GTC AGG AAC AAC AAC CTG ACC
Asn Arg Cys Phe Leu Asp Ser Ala Phe Val Arg Asn Asn Asn Leu Thr
260 265 270

TTC CTG CGG CCG TAC TAC CAG ACG GAG GAG GGC GAG GAG AAC CCG TTC
Phe Leu Arg Pro Tyr Tyr Gln Thr Glu Glu Gly Glu Glu Asn Pro Phe
275 280 285



ATC TGC TCC TCA CGC CGA GAC AAC GGC ATG CAG AAG TGC TCG CAC ATC 912
Ile Cys Ser Ser Arg Arg Asp Asn Gly Met Gln Lys Cys Ser His Ile
290 295 300

CCC GGC CGC CGC GAC GTG CGC ATG CCC TGC ACC CTG GGC TGG GAG GCC 960
Pro Gly Arg Arg Asp Val Arg Met Pro Cys Thr Leu Gly Trp Glu Ala
305 310 315 320

TAC ACG CAG CCG CAG GCC GAG GGG GTG GGC GCT GCA CGC AAC GCC TGC 1008
Tyr Thr Gln Pro Gln Ala Glu Gly Val Gly Ala Ala Arg Asn Ala Cys
325 330 335

ATC AAC TGG AAC CAG TAC TAC AAC GTG TGC CGC TCG GGT GAC TCC AAC 1056
Ile Asn Trp Asn Gln Tyr Tyr Asn Val Cys Arg Ser Gly Asp Ser Asn
340 345 350

CCC CAC AAC GGT GCC ATC AAC TTC GAC AAC ACC TGC TAC GCC TGG ATT 1104
Pro His Asn Gly Ala Ile Asn Phe Asp Asn Thr Cys Tyr Ala Trp Ile
355 360 365

GCC ATC TTC CAG GTG ATC ACG CTG GAA GGC TGG GTG GAC ATC ATG TAC 1152
Ala Ile Phe Gln Val Ile Thr Leu Glu Gly Trp Val Asp Ile Met Tyr
370 375 380

TAC GTC ATG GAC GCC CAC TCA TTC TAC AAC TTC ATG- TAT TTC ATC CTG 1200
Tyr Val Met Asp Ala His Ser Phe Tyr Asn Phe Ile Tyr Phe Ile Leu
385 390 395 400

CTC ATC ATC GTG GGC TCC TTC TTC ATG ATC AAC CTG TGC CTG GTG GTG 1248
Leu Ile Ile Val Gly Ser Phe Phe Met Ile Asn Leu Cys Leu Val Val
405 410 415

ATT GCC ACG CAG TTC TCG GAG ACG AAG CAG CGG GAG AGT CAG CTG ATG 1296
Ile Ala Thr Gln Phe Ser Glu Thr Lys Gln Arg Glu Ser Gln Leu Met
420 425 430

CGG GAG CAG CGG GCA CGC CAC CTG TCC AAC GAC AGC ACG CTG GCC AGC 1344
Arg Glu Gln Arg Ala Arg His Leu Ser Asn Asp Ser Thr Leu Ala Ser
435 440 445

TTC TCC GAG CCT GGC AGC TGC TAC GAA GAG CTG CTG AAG TAC GTG GGC 1392
Phe Ser Glu Pro Gly Ser Cys Tyr Glu Glu Leu Leu Lys Tyr Val Gly
450 455 460

CAC ATA TTC CGC AAG GTC AAG CGG CAG CTT GCG CCT CTA CGC CCG CTG 1440
His Ile Phe Arg Lys Val Lys Arg Gln Leu Ala Pro Leu Arg Pro Leu
465 470 475 480

GCA GAG CCG TGG CGC AAG AAG GTG GAC CCC AGT GCT GTG CAA GGC CAG 1488
Ala Glu Pro Trp Arg Lys Lys Val Asp Pro Ser Ala Val Gln Gly Gln
485 490 495

GGT CCC GGG CAC CGC CAG CGC CGG GCA GGC AGG CAC ACA GCC TCG GTG 1536
Gly Pro Gly His Arg Gln Arg Arg Ala Gly Arg His Thr Ala Ser Val
500 505 510

CAC CAC CTG GTC TAC CAC CAC CAT CAC CAC CAC CAC CAC CAC TAC CAT 1584
His His Leu Val Tyr His His His His His His His His His Tyr His
515 520 525

TTC AGC CAT GGC AGC CCC CGC AGG CCC GGC CCC GAG CCA GGC GCC TGC 1632
Phe Ser His Gly Ser Pro Arg Arg Pro Gly Pro Glu Pro Gly Ala Cys
530 535 540

GAC ACC AC-G CTG GTC CGA GCT GGC GCG CCC CCC TCG CCA CCT TCC CCA 1680
Asp Thr Arg Leu Val Arg Ala Gly Ala Pro Pro Ser Pro Pro Ser Pro
545 550 555 560

GGC CGC GGA CCC CCC GAC GCA GAG TCT GTG CAC AGC ATC TAC CAT GCC
Gly Arg Gly Pro Pro Asp Ala Glu Ser Val His Ser Ile Tyr His Ala
565 570 575

GAC TGC CAC ATA GAG GGG CCG CAG GAG AGG GCC CGG GTG GGC ACA TGC 1776
Asp Cys His Ile Glu Gly Pro Gln Glu Arg Ala Arg Val Gly Thr Cys
580 585 590

CGC AGC CAC TGC CGC TGC CAG CCT CAG GCT GGC CAC AGG GCT GGG CAC 1824
Arg Ser His Cys Arg Cys Gln Pro Gln Ala Gly His Arg Ala Gly His
595 600 605

CAT GAA CTA CCC CAC GAT CCT GCC CTC AGG GGT GGG CAG CGG CAA AGG 1872
His Glu Leu Pro His Asp Pro Ala Leu Arg Gly Gly Gln Arg Gln Arg
610 615 620

CAG CAC CAG CCC CGG ACC CAA GGG GAA GTG GGC CGG TGG ACC GCC AGG 1920
Gln His Gln Pro Arg Thr Gln Gly Glu Val Gly Arg Trp Thr Ala Arg
625 630 635 640

CAC CGG GGG CAC GGC CCG TTG AGC TTG AAC AGC CCT GAT CCC TAC GAG 1968
His Arg Gly His Gly Pro Leu Ser Leu Asn Ser Pro Asp Pro Tyr Glu
645 650 655

AAG ATC CCG CAT GTG GCC GGG GAG CA? GGA CTG GCC AGC CCT GGC CAT 2016
Lys Ile Pro His Val Ala Gly Glu His Gly Leu Ala Ser Pro Gly His
660 665 670

CTG TCG GGC CTC AGT GTG CCC TGC CCC CTG CCC AGC CCC CCA GCG GGC 2064
Leu Ser Gly Leu Ser Val Pro Cys Pro Leu Pro Ser Pro Pro Ala Gly
675 680 685

ACA CTG ACC TGT GAG CTG AAG AGC TGC CCG TAC TGC ACC CGT GCC CTG 2112
Thr Leu Thr Cys Glu Leu Lys Ser Cys Pro Tyr Cys Thr Arg Ala Leu
690 695 700

GAG GAC CCG GAG GGT GAG CTC AGC GGC TCG GAA AGT GGA GAC TCA GAT 2160
Glu Asp Pro Glu Gly Glu Leu Ser Gly Ser Glu Ser Gly Asp Ser Asp
705 710 715 720

GGC CGT GGC GTC TAT GAA TTC ACG CAG GAC GTC CGG CAC GGT GAC CGC 2208
Gly Arg Gly Val Tyr Glu Phe Thr Gln Asp Val Arg His Gly Asp Arg
725 730 735

TGG GAC CCC ACG CGA CCA CCC CGT GCG ACG GAC ACA CCA GGC CCA GGC 2256
Trp Asp Pro Thr Arg Pro Pro Arg Ala Thr Asp Thr Pro Gly Pro Gly
740 745 750

CCA GGC AGC CCC CAG CGG CGG GCA CAG CAG AGG GCA GCC CCG GGC GAG 2304
Pro Gly Ser Pro Gln Arg Arg Ala Gln Gln Arg Ala Ala Pro Gly Glu
755 760 765

CCA GGC TGG ATG GGC CGC CTC TGG GTT ACC TTC AGC GGC AAG CTG CGC 2352
Pro Gly Trp Met Gly Arg Leu Trp Val Thr Phe Ser Gly Lys Leu Arg
770 775 780

CGC ATC GTG GAC AGC AAG TAC TTC AGC CGT GGC ATC ATG ATG GCC ATC 2400
Arg Ile Val Asp Ser Lys Tyr Phe Ser Arg Gly Ile Met Met Ala Ile
785 790 795 800

CTT GTC AAC ACG CTG AGC ATG GGC GTG GAG TAC CAT GAG CAG CCC GAG 2448
Leu Val Asn Thr Leu Ser Met Gly Val Glu Tyr His Glu Gln Pro Glu
805 810 815

GAG CTG ACT AAT GCT CTG GAG ATC AGC AAC ATC GTG TTC ACC AGC ATG 2496
Glu Leu Thr Asn Ala Leu Glu Ile Ser Asn Ile Val Phe Thr Ser Met
820 825 830

TTT GCC CTG GAG ATG CTG CTG AAG CTG CTG CGC GCT GTC CCT CTG GGC 2544
Phe Ala Leu Glu Met Leu Leu Lys Leu Leu Arg Ala Val Pro Leu Gly
835 840 845

TAC ATC CGG AAC CCG TAC AAC ATC TTC GAC GGC ATC ATC GTG GTC ATC 2592
Tyr Ile Arg Asn Pro Tyr Asn Ile Phe Asp Gly Ile Ile Val Val Ile
850 855 860

AGC GTC TGG GAG ATC GTG GGG CAG GCG GAC GGT GGC TTG TCT GTG CTG 2640
Ser Val Trp Glu Ile Val Gly Gln Ala Asp Gly Gly Leu Ser Val Leu
865 870 875 880



CGC ACC TTC CGG CTG CTG CGT GTG CTG AAG CTG GTG CGC TTT CTG CCA 2688
Arg Thr Phe Arg Leu Leu Arg Val Leu Lys Leu Val Arg Phe Leu Pro
885 890 895

GCC CTG CGG CGC CAG CTC GTG GTG CTG GTG AAG ACC ATG GAC AAC GTG 2736
Ala Leu Arg Arg Gln Leu Val Val Leu Val Lys Thr Met Asp Asn Val
900 905 910

GCT ACC TTC TGC ACG CTG CTC ATG CTC TTC ATT TTC ATC TTC AGC ATC 2784
Ala Thr Phe Cys Thr Leu Leu Met Leu Phe Ile Phe Ile Phe Ser Ile
915 920 925

CTG GGC ATG CAC CTT TTC GGC TGC AAG TTC AGC CTG AAG ACA GAC ACC 2832
Leu Gly Met His Leu Phe Gly Cys Lys Phe Ser Leu Lys Thr Asp Thr
930 935 940

GGA GAC ACC GTG CCT GAC AGG AAG AAC TTC GAC TCC CTG CTG TGG GCC 2880
Gly Asp Thr Val Pro Asp Arg Lys Asn Phe Asp Ser Leu Leu Trp Ala
945 950 955 960

ATC GTC ACC GTG TTC CAG ATC CTG ACC CAG GAG GAC TGG AAC GTG GTC 2928
Ile Val Thr Val Phe Gln Ile Leu Thr Gln Glu Asp Trp Asn Val Val
965 970 975

CTG TAC AAC GGC ATG GCC TCC ACC TCC TCC TGG GCC GCC CTC TAC TTC 2976
Leu Tyr Asn Gly Met Ala Ser Thr Ser Ser Trp Ala Ala Leu Tyr Phe
980 985 990

GTG GCC CTC ATG ACC TTC GGC AAC TAT GTG CTC TTC AAC CTG CTG GTG 3024
Val Ala Leu Met Thr Phe Gly Asn Tyr Val Leu Phe Asn Leu Leu Val
995 1000 1005

GCC ATC CTC GTG GAG GGC TTC CAG GCG GAG GGC GAT GCC AAC AGA TCC 3072
Ala Ile Leu Val Glu Gly Phe Gln Ala Glu Gly Asp Ala Asn Arg Ser
1010 1015 1020



GAC ACG GAC GAG GAC AAG ACG TCG GTC CAC TTC GAG GAG GAC TTC CAC
Asp Thr Asp Glu Asp Lys Thr Ser Val His Phe Glu Glu Asp Phe His
1025 1030 1035 1040

AAG CTC AGA GAA CTC CAG ACC ACA GAG CTG AAG ATG TGT TCC CTG GCC
Lys Leu Arg Glu Leu Gln Thr Thr Glu Leu Lys Met Cys Ser Leu Ala
1045 1050 1055

GTG ACC CCC AAC GGC ACC TGG AGG GAC GAG GCA GCC TGT CCC CTC CCC
Val Thr Pro Asn Gly Thr Trp Arg Asp Glu Ala Ala Cys Pro Leu Pro
1060 1065 1070

TCA TCA TGT GCA CAG CTG CCA CGC CCA TGC CTA CCC CCA AGA GCT CAC
Ser Ser Cys Ala Gln Leu Pro Arg Pro Cys Leu Pro Pro Arg Ala His
1075 1080 1085

CAT TCC TGG ATG CAG CCC CCA GCC TCC CAG ACT CTC GGC GTG GCA GCA
His Ser Trp Met Gln Pro Pro Ala Ser Gln Thr Leu Gly Val Ala Ala
1090 1095 1100

GCA GCT CCG GGG ACC CGC CAC TGG GAG ACC AGA AGC CTC CGG CAG CCT
Ala Ala Pro Gly Thr Arg His Trp Glu Thr Arg Ser Leu Arg Gln Pro
1105 1110 1115 1120

CCG AAG TTC TCC CTG TGC CCC CTG GGG CCC AGT GGC GCC TGG AGC AGC
Pro Lys Phe Ser Leu Cys Pro Leu Gly Pro Ser Gly Ala Trp Ser Ser
1125 1130 1135

CGG CGC TCC AGC TGG AGC AGC CTG GGC CGT GCC CAG CCT CAA GCG CCG
Arg Arg Ser Ser Trp Ser Ser Leu Gly Arg Ala Gln Pro Gln Ala Pro
1140 1145 1150

GCG TGC CAG TGT GGG GAA CGT GAG TCC CTG CTG TCT GGC GAG GGC AAG
Ala Cys Gln Cys Gly Glu Arg Glu Ser Leu Leu Ser Gly Glu Gly Lys
1155 1160 1165 
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GGC AGC ACC GAC GAC GAA GCT GAG GAC GGC AGG GCG CGC TCC GGG CCC 3552 
Gly Ser Thr Asp Asp Glu Ala Glu Asp Gly Arg Ala Arg Ser Gly Pro
1170 1175 1180

CGT GCC ACC CCA CTG CGG CGG GCC GAG TCC CTG GAC CCA CGG CCC CTG 3600
Arg Ala Thr Pro Leu Arg Arg Ala Glu Ser Leu Asp Pro Arg Pro Leu
1185 1190 1195 1200

CGG CGG CCG CCT CCC GCC TAC CAA GTG CGC GAT CGC GAC GGG CAG GTG 3648
Arg Arg Pro Pro Pro Ala Tyr Gln Val Arg Asp Arg Asp Gly Gln Val
1205 1210 1215

GTG GCC CTG CCC AGC GAC TTC TTC CTG CGC ATC GAC AGC CAC CGT GAG 3696
Val Ala Leu Pro Ser Asp Phe Phe Leu Arg Ile Asp Ser His Arg Glu
1220 1225 1230

GAT GCA GCC GAG CTT GAC GAC GAC TCG GAG GAC AGC TGC TGC CTC CGC 3744
Asp Ala Ala Glu Leu Asp Asp Asp Ser Glu Asp Ser Cys Cys Leu Arg
1235 1240 1245

CTG CAT AAA GTG CTG GTG CCC TAC AAG CCC CAG CGG TGC CGG AGC AGG 3792
Leu His Lys Val Leu Val Pro Tyr Lys Pro Gln Arg Cys Arg Ser Arg
1250 1255 1260

AGG CCT GGG CCC TCT ACC CTC TAC CTC TTC TCC CCA CAG AAC CGG TTC 3840
Arg Pro Gly Pro Ser Thr Leu Tyr Leu Phe Ser Pro Gln Asn Arg Phe
1265 1270 1275 1280

CGC GTC TCC TGC CAG AAG GTC ATC ACA CAC AAG ATG TTT GAT CAC GTG 3888
Arg Val Ser Cys Gln Lys Val Ile Thr His Lys Met Phe Asp His Val
1285 1290 1295

GTC CTC GTC TTC ATC TTC CTC AAC TGC GTC ACC ATC GCC CTG GAG AGG 3936
Val Leu Val Phe Ile Phe Leu Asn Cys Val Thr Ile Ala Leu Glu Arg
1300 1305 1310

CCT GAC ATT GAT CCC GGC AGC ACC GAG CGG GTC TTC CTC AGC GTC TCC 3984
Pro Asp Ile Asp Pro Gly Ser Thr Glu Arg Val Phe Leu Ser Val Ser
1315 1320 1325

AAT TAC ATC TTC ACG GCC ATC TTC GTG GCG GAG ATG ATG GTG AAG GTG 4032
Asn Tyr Ile Phe Thr Ala Ile Phe Val Ala Glu Met Met Val Lys Val
1330 1335 1340

GTG GCC CTG GGG CTG CTG TCC GGC GAG CAC GCC TAC CTG CAG AGC AGC 4080
Val Ala Leu Gly Leu Leu Ser Gly Glu His Ala Tyr Leu Gln Ser Ser
1345 1350 1355 1360

TGG AAC CTG CTG GAT GGG CTG CTG GTG CTG GTG TCC CTG GTG GAC ATT 4128
Trp Asn Leu Leu Asp Gly Leu Leu Val Leu Val Ser Leu Val Asp Ile
1365 1370 1375

GTC GTG GCC ATG GCC TCG GCT GGT GGC GCC AAG ATC CTG GGT GTT CTG 4176
Val Val Ala Met Ala Ser Ala Gly Gly Ala Lys Ile Leu Gly Val Leu
1380 1385 1390

CGC GTG CTG CGT CTG CTG CGG ACC CTG CGG CCT CTG AGG GTC ATC AGC 4224
Arg Val Leu Arg Leu Leu Arg Thr Leu Arg Pro Leu Arg Val Ile Ser
1395 1400 1405

CGG CCC CGG CTC AAG CTG GTG GTG GAG ACG CTG ATA TCA TCA CTC AGG 4272
Arg Pro Arg Leu Lys Leu Val Val Glu Thr Leu Ile Ser Ser Leu Arg
1410 1415 1420

CCC ATT GGG AAC ATC GTC CTC ATC TGC TGC GCC TTC TTC ATC ATT TTT 4320
Pro Ile Gly Asn Ile Val Leu Ile Cys Cys Ala Phe Phe Ile Ile Phe
1425 1430 1435 1440

GGC ATT TTG GGT GTG CAG CTC TTC AAA GGG AAG TTC TAC TAC TGC GAG 4368
Gly Ile Leu Gly Val Gln Leu Phe Lys Gly Lys Phe Tyr Tyr Cys Glu
1445 1450 1455



GGC CCC GAC ACC AGG AAC ATC TCC ACC AAG GCA CAG TGC CGG GCC GCC 4416
Gly Pro Asp Thr Arg Asn Ile Ser Thr Lys Ala Gln Cys Arg Ala Ala
1460 1465 1470

CAC TAC CGC TGG GTG CGA CGC AAG TAC AAC TTC GAC AAC CTG GGC CAG 4464
His Tyr Arg Trp Val Arg Arg Lys Tyr Asn Phe Asp Asn Leu Gly Gln
1475 1480 1485

GCC CTG ATG TCG CTG TTC GTG CTG TCA TCC AAG GAT GGA TGG GTG AAC 4512
Ala Leu Met Ser Leu Phe Val Leu Ser Ser Lys Asp Gly Trp Val Asn
1490 1495 1500

ATC ATG TAC GAC GGG CTG GAT GCC GTG GGT GTC GAC CAG CAG CCT GTG 4560
Ile Met Tyr Asp Gly Leu Asp Ala Val Gly Val Asp Gln Gln Pro Val
1505 1510 1515 1520

CAG AAC CAC AAC CCC TGG ATG CTG CTG TAC TTC ATC TCC TTC CTC TGC 4608
Gln Asn His Asn Pro Trp Met Leu Leu Tyr Phe Ile Ser Phe Leu Cys
1525 1530 1535

TAC ATC GTC AGC TTC TTC GTG CTC AAC ATG TTC GTG GGC GTC GTG GTC 4656
Tyr Ile Val Ser Phe Phe Val Leu Asn Met Phe Val Gly Val Val Val
1540 1545 1550

GAG AAC TTC CAC AAG TGC CGG CCG CAC CAG GAG GCG GAG GAG GCG CGG 4704
Glu Asn Phe His Lys Cys Arg Pro His Gln Glu Ala Glu Glu Ala Arg
1555 1560 1565

CGG CGA GAG GAG AAG CGG CTG CGG CGC CTA GAG AGG AGG CGC AGG AGC 4752
Arg Arg Glu Glu Lys Arg Leu Arg Arg Leu Glu Arg Arg Arg Arg Ser
1570 1575 1580

ACT TTC CCC AGC CCA GAG GCC CAG CGC CGG CCC TAC TAT GCC GAC TAC 4800
Thr Phe Pro Ser Pro Glu Ala Gln Arg Arg Pro Tyr Tyr Ala Asp Tyr
1585 1590 1595 1600



TCG CCC ACG CGC CGC CGC TCC ATT CAC TCG CTG TGC ACC AGC CAC TAT 4848
Ser Pro Thr Arg Arg Arg Ser Ile His Ser Leu Cys Thr Ser His Tyr
1605 1610 1615

CTC GAC CTC TTC ATC ACC TTC ATC ATC TGT GTC AAC GTC ATC ACC ATG 4896
Leu Asp Leu Phe Ile Thr Phe Ile Ile Cys Val Asn Val Ile Thr Met
1620 1625 1630

TCC ATG GAG CAC TAT AAC CAA CCC AAG TCG CTG GAC GAG GCC CTC AAG 4944
Ser Met Glu His Tyr Asn Gln Pro Lys Ser Leu Asp Glu Ala Leu Lys
1635 1640 1645

TAC TGC AAC TAC GTC TTC ACC ATC GTG TTT GTC TTC GAG GCT GCA CTG 4992
Tyr Cys Asn Tyr Val Phe Thr Ile Val Phe Val Phe Glu Ala Ala Leu
1650 1655 1660

AAG CTG GTA GCA TTT GGG TTC CGT CGG TTC TTC AAG GAC AGG TGG AAC 5040
Lys Leu Val Ala Phe Gly Phe Arg Arg Phe Phe Lys Asp Arg Trp Asn
1665 1670 1675 1680

CAG CTG GAC CTG GCC ATC GTG CTG CTG TCA CTC ATG GGC ATC ACG CTG 5088
Gln Leu Asp Leu Ala Ile Val Leu Leu Ser Leu Met Gly Ile Thr Leu
1685 1690 1695

GAG GAG ATA GAG ATG AGC GCC GCG CTG CCC ATC AAC CCC ACC ATC ATC 5136
Glu Glu Ile Glu Met Ser Ala Ala Leu Pro Ile Asn Pro Thr Ile Ile
1700 1705 1710

CGC ATC ATG CGC GTG CTT CGC ATT GCC CGT GTG CTG AAG CTG CTG AAG 5184
Arg Ile Met Arg Val Leu Arg Ile Ala Arg Val Leu Lys Leu Leu Lys
1715 1720 1725

ATG GCT ACG GGC ATG CGC GCC CTG CTG GAC ACT GTG GTG CAA GCT CTC 5232
Met Ala Thr Gly Met Arg Ala Leu Leu Asp Thr Val Val Gln Ala Leu
1730 1735 1740

CCC CAG GTG GGG AAC CTG GGC CTT CTT TTC ATG CTC CTG TTT TTT ATC 5280
Pro Gln Val Gly Asn Leu Gly Leu Leu Phe Met Leu Leu Phe Phe Ile
1745 1750 1755 1760



TAT CTG AGA TTG GGA GTG GAG CTG TTC GGG AGG CTG GAG TGC AGT GAA 5328
Tyr Leu Arg Leu Gly Val Glu Leu Phe Gly Arg Leu Glu Cys Ser Glu
1765 1770 1775

GAC AAC CCC TGC GAG GGC CTG AGC AGG CAC GCC ACC TTC AGC AAC TTC 5376
Asp Asn Pro Cys Glu Gly Leu Ser Arg His Ala Thr Phe Ser Asn Phe
1780 1785 1790

GGC ATG GCC TTC CTC ACG CTG TTC CGC GTG TCC ACG GGG GAC AAC TGG 5424
Gly Met Ala Phe Leu Thr Leu Phe Arg Val Ser Thr Gly Asp Asn Trp
1795 1800 1805

AAC GGG ATC ATG AAG GAC ACG CTG CGC GAG TGC TCC CGT GAG GAC AAG 5472
Asn Gly Ile Met Lys Asp Thr Leu Arg Glu Cys Ser Arg Glu Asp Lys
1810 1815 1820

CAC TGC CTG AGC TAC CTG CCG GCC CCG TCG CCC GTC TAC TTC GTG ACC 5520
His Cys Leu Ser Tyr Leu Pro Ala Pro Ser Pro Val Tyr Phe Val Thr
1825 1830 1835 1840

TTC GTG CTG GTG CCC CAG TTC GTG CTG GTG AAC GTG GTG GTG GCC GTG 5568
Phe Val Leu Val Pro Gln Phe Val Leu Val Asn Val Val Val Ala Val
1845 1850 1855

CTC ATG AAG CAC CTG GAG GAG AGC AAC AAG GAG GCT CGG GAG GAT GCG 5616
Leu Met Lys His Leu Glu Glu Ser Asn Lys Glu Ala Arg Glu Asp Ala
1860 1865 1870

GAG CTG GAC GCC GAG ATC GAG CTG GAG ATG GCG CAG GGC CCC GGG AGT 5664
Glu Leu Asp Ala Glu Ile Glu Leu Glu Met Ala Gln Gly Pro Gly Ser
1875 1880 1885



GCA CGC CGG GTG GAC GCG GAC AGG CCT CCC TTG CCC CAG GAG AGT CCG 5712
Ala Arg Arg Val Asp Ala Asp Arg Pro Pro Leu Pro Gln Glu Ser Pro
1890 1895 1900

GCG CCA GGG ACG CCC CAA ACC TGG TTG CAC GCA AGG TGT CCG TGT CCA 5760
Ala Pro Gly Thr Pro Gln Thr Trp Leu His Ala Arg Cys Pro Cys Pro
1905 1910 1915 1920

GGA TCT CTC GCT GCC CAA CGA CAG CTA CAT GTT CAG GCC CGT GGT GCC 5808
Gly Ser Leu Ala Ala Gln Arg Gln Leu His Val Gln Ala Arg Gly Ala
1925 1930 1935

TGC CTC GGC GCC CCG GGC CCG CCC GCT GCA GGA GGT GGA GAT GGA GAC 5856
Cys Leu Gly Ala Pro Gly Pro Pro Ala Ala Gly Gly Gly Asp Gly Asp
1940 1945 1950

CTA TGG GGC CGG CAC CCC CTT GGA GTC CTG TGC CAT CCC ATC CAG ATC 5904
Leu Trp Gly Arg His Pro Leu Gly Val Leu Cys His Pro Ile Gln Ile
1955 1960 1965

CCA TTG GCT GTG TCG AAC CCA GCC AGG AGC GGC GAG CCC CTC CAC GCC 5952
Pro Leu Ala Val Ser Asn Pro Ala Arg Ser Gly Glu Pro Leu His Ala
1970 1975 1980

CTG TCC CCT CGG GGC ACA GCC GCT CCC CCA GTC TCA GCC GGC TGC TCT 6000
Leu Ser Pro Arg Gly Thr Ala Ala Pro Pro Val Ser Ala Gly Cys Ser
1985 1990 1995 2000

GCA GAC AGG AGG CTG TGC ACA CCG ATT CCT TGG AAG GGA AGA TTG ACA 6048
Ala Asp Arg Arg Leu Cys Thr Pro Ile Pro Trp Lys Gly Arg Leu Thr
2005 2010 2015

GCC CTA GGG ACA CCC TGG ATC CTG CAG AGC CTG GTG AGA AAC CCC CGG 6096
Ala Leu Gly Thr Pro Trp Ile Leu Gln Ser Leu Val Arg Asn Pro Arg
2020 2025 2030



(2) INFORMATION FOR SEQ ID NO: 2: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7720 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

GGCGGTGACC GCGCCGCCCG GCGATGCCCG CGGGGACGCC GCCGGCCAGC AGAGCAGGTG 60

CTGCCGGCCG CCACCATGAC CGAGGGCGCA CGGGCCGCCG ACGAGGTCCG GGTGCCCCTG 120 

GGGCGCCGCC CCTGGCCCTG CGGCGTTGG- GGGGGCGTCC CCGGAGAGCC CCGGGGCGCC 180

GGGACGCGAG GCGGAGGGGG GTTCGAGCTC GGCGTGTCAC CCTCCGAGAG CCCGGCGGCC 240 

GAGCGCTGCG CGGAGCTGGS TGCCGACGAG GAGCAGCGCG TCCCGTACCC GGCCTTGGCG 300

GCCACGGTCT TCTTCTGCCT CGGTCAGACC ACGCGGCCGC GCAGCTGGTC CGTCCGGCTG 360

GTCTGCAACC CATGGTTCGA GCACGTGAGC ATGCTGGTAA TCATGCTCAA CTGCGTGACC 420

CTGGGCATGT TCCGGCCCTG TGAGGACGTT GAGTGCGGCT CCGAGCGCTG CAACATCCTG 480

GAGGCCTTTG ACGCCTTCAT TTTCGCCTTT TTTGCGGTGG AGATGGTCAT CAAGATGGTG 540 

GCCTTGGGGC TGTTCGGGCA GAAGTGTTAC CTGGGTGACA CGTGGAACAG GCTGGATTTC 600 

TTCATCGTCG TGGCGGGCAT GATGGAGTAC TCGTTGGACG GACACAACGT GAGCCTCTCG 660

GCTATCAGGA CCGTGCGGGT GCTGCGGCCC CTCCGCGCCA TCAACCGCGT GCCTAGCATG 720 

CGGATCCTGG TCACTCTGCT GCTGGATACG CTGCCCATGC TCGGGAACGT CCTTCTGCTG 780 


TGCTTCTTCG TCTTCTTCAT TTTCGGCATC GTTGGCGTCC AGCTCTGGGC TGGCCTCCTG 840 




CGGAACCGCT GCTTCCTGGA CAGTGCCTTT GTCAGGAACA ACAACCTGAC CTTCCTGCGG 900 

CCGTACTACC AGACGGAGGA GGGCGAGGAG AACCCGTTCA TCTGCTCCTC ACGCCGAGAC 960 

AACGGCATGC AGAAGTGCTC GCACATCCCC GGCCGCCGCG ACGTGCGCAT GCCCTGCACC 1020

CTGGGCTGGS AGGCCTACAC GCAGCCGCAG GCCGAGGGGG TGGGCGCTGC ACGCAACGCC 1080 

TGCATCAACT GGAACCAGTA CTACAACGTG TGCCGCTCGG GTGACTCCAA CCCCCACAAC 1140

GGTGCCATCA ACTTCGACAA CACCTGCTAC GCCTGGATTG CCATCTTCCA GGTGATCACG 1200 

CTGGAAGGCT GGGTGGACAT CATGTACTAC GTCATGGACG CCCACTCATT CTACAACTTC 1260

ATCTATTTCA TCCTGCTCAT CATCGTGGGC TCCTTCTTCA TGATCAACCT GTGCCTGGTG 1320

GTGATTGCCA CGCAGTTCTC GGAGACGAAG CAGCGGGAGA GTCAGCTGAT GCGGGAGCAG 1380

CGGGCACGCC ACCTGTCCAA CGACAGCACG CTGGCCAGCT TCTCCGAGCC TGGCAGCTGC 1440

TACGAAGAGC TGCTGAAGTA CGTGGGCCAC ATATTCCGCA AGGTCAAGCG GCAGCTTGCG 1500 

CCTCTACGCC CGCTGGCAGA GCCGTGGCGC AAGAAGGTGG ACCCCAGTGC TGTGCAAGGC 1560 

CAGGGTCCCG GGCACCGCCA GCGCCGGGCA GGCAGGCACA CAGCCTCGGT GCACCACCTG 1620

GTCTACCACC ACCATCACCA CCACCACCAC CACTACCATT TCAGCCATGG CAGCCCCCGC 1680 

AGGCCCGGCC CCGAGCCAGG CGCCTGCGAC ACCAGGCTGG TCCGAGCTC-G CGCGCCCCCC 1740 

TCC-CCACCTT CCCCAGGCCG CGGACCCCCC GACGCAGAGT CTGTGCACAG CATCTACCAT 1800

GCCGACTGCC ACATAGAGGG GCCGCAGGAG AGGGCCCGGG TGGGCACATG CCGCAGCCAC 1860 

TGCCGCTGCC AGCCTCAGGC TGGCCACAGG GCTGGGCACC ATGAACTACC CCACGATCCT 1920

GCCCTCAGGG GTGGGCAGCG GCAAAGGCAG CACCAGCCCC GGACCCAAGG GGAAGTGGGC 1980 



CGGTGGACCG CCAGGCACCG GGGGCACGGC CCGTTGAGCT TGAACAGCCC TGATCCCTAC 2040

GAGAAGATCC CGCATGTGGC CGGGGAGCAT GGACTGGCCA GCCCTGGCCA TCTGTCGGGC 2100 


CTCAGTGTGC CCTGCCCCCT GCCCAGCCCG CCAGCGGGCA CACTGACCTG TGAGCTGAAG 2160 

AGCTGCCCGT ACTGCACCCG TGCCCTGGAG GACCCGGAGG GTGAGCTCAG CGGCTCGGAA 2220 

AGTGGAGACT CAGATGGCCG TGGCGTCTA? GAATTCACGC AGGACGTCCG GCACGGTGAC 2280

CGCTGGGACC CCACGCGACC ACCCCGTGCG ACGGACACAC CAGGCCCAGG CCCAGGCAGC 2340

CCCCAGCGGC GGGCACAGCA GAGGGCAGCC CCGGGCGAGC CAGGCTGGAT GGGCCGCCTC 2400

TGGGTTACCT TCAGCGGCAA GCTGCGCCGC ATCGTGGACA GCAAGTACTT CAGCCGTGGC 2460



ATCATGATGG CCATCCTTGT CAACACGCTG AGCATGGGCG TGGAGTACCA TGAGCAGCCC 2520 



GAGGAGCTGA CTAATGCTCT GGAGATCAGC AACATCGrGT TCACCAGCAT GTTTGCCCTG 2580

GAGATGCTGC TGAAGCTGCT GCGCGCTGTC CCTCTGGGCT ACATCCGGAA CCCGTACAAC 2640



ATCTTCGACG GCATCATCGT GGTCATCAGC GTCTGGGAGA TCGTGGGGCA GGCGGACGGT 2700 



GGCTTGTCTG TGCTGCGCAC CTTCCGGCTG CTGCGTGTGC TGAAGCTGGT GCGCTTTCTG 2760 



CCAGCCCTGC GGCGCCAGCT CGTGGTGCTG GTGAAGACCA TGGACAACGT GGCTACCTTC 2820 



TGCACGCTGC TCATGCTCTT CATTTTCATC TTCAGCATCC TGGGCATGCA CCTTTTCGGC 2880



TGCAAGTTCA GCCTGAAGAC AGACACCGGA GACACCGTGC CTGACAGGAA GAACTTCGAC 2940



TCCCTGCTGT GGGCCATCGT CACCGTGTTC CAGATCCTGA CCCAGGAGGA CTGGAACGTG 3000 



GTCCTGTACA ACGGCATGGC CTCCACCTCC TCCTGGGCCG CCCTCTACTT CGTGGCCCTC 3060 



ATGACCTTCG GCAACTATGT GCTCTTCAAC CTGCTGGTGG CCATCCTCGT GGAGGGCTTC 3120

CAGGCGGAGG GCGATGCCAA CAGATCCGAC ACGGACGAGG ACAAGACGTC GGTCCACTTC 3180

GAGGAGGACT TCCACAAGCT CAGAGAACTC CAGACCACAG AGCTGAAGAT GTGTTCCCTG 3240

GCCGTGACCC CCAACGGCAC CTGGAGGGAC GAGGCAGCCT GTCCCCTCCC CTCATCATGT 3300 



GCACAGCTGC CACGCCCATG CCTACCCCCA AGAGCTCACC ATTCCTGGAT GCAGCCCCCA 3360 


GCCTCCCAGA CTCTCGGCGT GGCAGCAGCA GCTCCGGGGA CCCGCCACTG GGAGACCAGA 3420 



AGCCTCCGGC AGCCTCCGAA GTTCTCCCTG TGCCCCCTGG GGCCCAGTGG CGCCTGGAGC 3480

AGCCGGCGCT CCAGCTGGAG CAGCCTGGGC CGTGCCCAGC CTCAAGCGCC GGCGTGCCAG 3540

TGTGGGGAAC GTGAGTCCCT GCTGTCTGGC GAGGGCAAGG GCAGCACCGA CGACGAAGCT 3600

GAGGACGGCA GGGCGCGCTC CGGGCCCCGT GCCACCCCAC TGCGGCGGGC CGAGTCCCTG 3660

GACCCACGGC CCCTGCGGCG GCCGCCTCCC GCCTACCAAG TGCGCGATCG CGACGGGCAG 3720

GTGGTGGCCC TGCCCAGCGA CTTCTTCCTG CGCATCGACA GCCACCGTGA GGATGCAGCC 3780

GAGCTTGACG ACGACTCGGA GGACAGCTGC TGCCTCCGCC TGCATAAAGT GCTGGTGCCC 3840



TACAAGCCCC AGCGGTGCCG GAGCAGGAGG CCTGGGCCCT CTACCCTCTA CCTCTTCTCC 3900 



CCACAGAACC GGTTCCGCGT CTCCTGCCAG AAGGTCATCA CACACAAGAT GTTTGATCAC 3960

GTGGTCCTCG TCTTCATCTT CCTCAACTGC GTCACCATCG CCCTGGAGAG GCCTGACATT 4020

GATCCCGGCA GCACCGAGCG GGTCTTCCTC AGCGTCTCCA ATTACATCTT CACGGCCATC 4080

TTCGTGGCGG AGATGATGGT GAAGGTGGTG GCCCTGGGGC TGCTGTCCGG CGAGCACGCC 4140

TACCTGCAGA GCAGCTGGAA CCTGCTGGAT GGGCTGCTGG TGCTGGTGTC CCTGGTGGAC 4200

ATTGTCGTGG CCATGGCCTC GGCTGGTGGC GCCAAGATCC TGGGTGTTCT GCGCGTGCTG 4260

CGTCTGCTGC GGACCCTGCG GCCTCTGAGG GTCATCAGCC GGCCCCGGCT CAAGCTGGTG 4320 

GTGGAGACGC TGATATCATC ACTCAGGCCC ATTGGGAACA TCGTCCTCAT CTGCTGCGCC 4380

TTCTTCATCA TTTTTGGCAT TTTGGGTGTG CAGCTCTTCA AAGGGAAGTT CTACTACTGC 4440

GAGGGCCCCG ACACCAGGAA CATCTCCACC AAGGCACAGT GCCGGGCCGC CCACTACCGC 4500

TGGGTGCGAC GCAAGTACAA CTTCGACAAC CTGGGCCAGG CCCTGATGTC GCTGTTCGTG 4560

CTGTCATCCA AGGATGGATG GGTGAACATC ATGTACGACG GGCTGGATGC CGTGGGTGTC 4620

GACCAGCAGC CTGTGCAGAA CCACAACCCC TGGATGCTGC TGTACTTCAT CTCCTTCCTC 4680

TGCTACATCG TCAGCTTCTT CGTGCTCAAC ATGTTCGTGG GCGTCGTGGT CGAGAACTTC 4740

CACAAGTGCC GGCCGCACCA GGAGGCGGAG GAGGCGCGGC GGCGAGAGGA GAAGCGGCTG 4800

CGGCGCCTAG AGAGGAGGCG CAGGAGCACT TTCCCCAGCC CAGAGGCCCA GCGCCGGCCC 4860

TACTATGCCG ACTACTCGCC CACGCGCCGC CGCTCCATTC ACTCGCTGTG CACCAGCCAC 4920 

TATCTCGACC TCTTCATCAC CTTCATCATC TGTGTCAACG TCATCACCAT GTCCATGGAG 4980

CACTATAACC AACCCAAGTC GCTGGACGAG GCCCTCAAGT ACTGCAACTA CGTCTTCACC 5040

ATCGTGTTTG TCTTCGAGGC TGCACTGAAG CTGGTAGCAT TTGGGTTCCG TCGGTTCTTC 5100

AAGGACAGGT GGAACCAGCT GGACCTGGCC ATCGTGCTGC TGTCACTCAT GGGCATCACG 5160 

CTGGAGGAGA TAGAGATGAG CGCCGCGCTG CCCATCAACC CCACCATCAT CCGCATCATG 5220 


CGCGTGCTTC GCATTGCCCG TGTGCTGAAG CTGCTGAAGA TGGCTACGGG CATGCGCGCC 5280 




CTGCTGGACA CTGTGGTGCA AGCTCTCCCC CAGGTGGGGA ACCTGGGCCT TCTTTTCATG 5340 

CTCCTGTTTT TTATCTATCT GAGATTGGGA GTGGAGCTGT TCGGGAGGCT GGAGTGCAGT 5400 

GAAGACAACC CCTGCGAGGG CCTGAGCAGG CACGCCACCT TCAGCAACTT CGGCATGGCC 5460

TTCCTCACGC TGTTCCGCGT GTCCACGGGG GACAACTGGA ACGGGATCAT GAAGGACACG 5520

CTGCGCGAGT GCTCCCGTGA GGACAAGCAC TGCCTGAGCT ACCTGCCGGC CCCGTCGCCC 5580

GTCTACTTCG TGACCTTCGT GCTGGTGCCC CAGTTCGTGC TGGTGAACGT GGTGGTGGCC 5640 

GTGCTCATGA AGCACCTGGA GGAGAGCAAC AAGGAGGCTC GGGAGGATGC GGAGCTGGAC 5700 

GCCGAGATCG AGCTGGAGAT GGCGCAGGGC CCCGGGAGTG CACGCCGGGT GGACGCGGAC 5760

AGGCCTCCCT TGCCCCAGGA GAGTCCGGCG CCAGGGACGC CCCAAACCTG GTTGCACGCA 5820

AGGTGTCCGT GTCCAGGATC TCTCGCTGCC CAACGACAGC TACATGTTCA GGCCCGTGGT 5880

GCCTGCCTCG GCGCCCCGGG CCCGCCCGCT GCAGGAGGTG GAGATGGAGA CCTATGGGGC 5940

CGGCACCCCC TTGGAGTCCT GTGCCATCCC ATCCAGATCC CATTGGCTGT GTCGAACCCA 6000

GCCAGGAGCG GCGAGCCCCT CCACGCCCTG TCCCCTCGGG GCACAGCCGC TCCCCCAGTC 6060

TCAGCCGGCT GCTCTGCAGA CAGGAGGCTG TGCACACCGA TTCCTTGGAA GGGAAGATTG 6120

ACAGCCCTAG GGACACCCTG GATCCTGCAG AGCCTGGTGA GAAACCCCCG GTGAGGCCGG 6180

TGACCCAGGG GGGCTCCCTG CAGTCCCCAC CACGCTCCCC ACGGCCCGCC AGCGTCCGCA 6240 

CTCGTAAGCA TACCTTCGGA CAGCGCTGCG TCTCCAGCCG GCCGGCGGCC CCAGGCGGAG 6300 

AGGAGGCCGA GGCCTCGGAC CCAGCCGACG AGGAGGTCAG CCACATCACC AGCTCCGCCT 6360

GCCCCTGGCA GCCCACAGCC GAGCCCCATG GCCCCGAAGC CTCTCCGGTG GCCGGCGGCG 6420 






AGCGGGACCT GCGCAGGCTC TACAGCGTGG ATGCTCAGGG CTTCCTGGAC AAGCCGGGCC 6480 



GGGCAGACGA GCAGTGGCTG CCCTCGGGGA GTGGGCAGCG GGGAGCCTGG GGAGGCGAAG 6540 



GCCTGGGGCC TGAGGCCGAG CCCGCTCTGG GTGCGCGCAG AAAGAAGAAG ATGAGCCCCC 6600 



CCTGCATCTC GGTGGAACCC CCTGCGGAGG ACGAGGGCTC TGCGCGGCCC TCCGCGGCAG 6660 



AGGGCGGCAG ACCACACTGA GGCTCAGGAC CCCGTCCTGT GAGGCCACGC CTCACAGGGA 6720



CTCCCTGGAG CCCACAGAGG GCTCAGGCGC CGGGGGGGAC CCTGCAGCCA AGGGGGAGCG 6780 






CTGGGGCCAG GCCTCCTGCC GGGCTGAGCA CCTGACCGTC CCCAGCTTTG CCTTTGAGCC 6840 



GCTGGACCTC GGGGTCCCCA GTGGAGACCC TPTCTTGGAC GGTAGCCACA GTGTGACCCC 6900 



AGAArCCAGA GCTTCCTCTT CAGGGGCCAT AGTGCCCCTG GAACCCCCAG AATCAGAGCC 6960 



TCCCATGCCC GTCGGTGACC CCCCAGAGAA GAGGCGGGGG CTGTACCTCA CAGTCCCCCA 7020

GTGTCCTCTG GAGAAACCAG GGTCCCCCTC AGCCACCCCT GCCCCAGGGG GTGGTGCAGA 7080

TGACCCCGTG TAGCTCGGGG CTTGGTGCCG CCCACGGCTT TGGCCCTGGG GTCTGGGGGC 7140 

CCGCTGGGGT GGAGGCCCAG GCAGAACCCT C-CATGGACCC TGACTTGGGT CCCGTCGTGA 7200 

GCAGAAAGGC CCGGGGAGGA TGACGGCCCA GGCCCTGGTT CTCTGCCCAG CGAAGCAGGA 7260 

GTAGCTGCCG GGCCCCCACG AGCCTCCGTC CGTTCTGGTT CGGGTTTCTC CGAGTTTTGC 7320

TACCAGCCGA GGCTGTGCGG GCAACTGGGT CAGCCTCCCG TCAGGAGAGA AGCCGCGTCT 7380 

GTGGGACGAA GACCGGGCAC CCGCCAGAGA GGGGAATGG? ACCAGGTTGC GTCCTTTCAG 7440

GCCCCGCGTT GTTACAGGAT CATCTCGCTG GGGGCCCTGT GCCTCTTGCC GGCGGCAGGT 7500

TGCATGCCAC CGCGGCCCGA ATGTCACCTT CACTCACAGT CTGAGTTCTT GTCCGCCTGT 7560 

CACGCCCTCA CCACCCTCCC CTTCCAGCCA CCACCCTTTC CGTTCCGCTC GGGCCTTCCC 7620 

AGAAGCGTCC TGTGACTCTG GGAGAGGTGA CACCTCACTA AGGGGCCGAC CCCATGGAGT 7680

AACGCGCCCG GCCCCGATGC GAATCAGGCC TCCCCCTCCG 7720 



(2) INFORMATION FOR SEQ ID NO: 3: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6858 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown



(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATG CTC CCC CAC CGG GTC CCC CGT TGC GTG AGG ACA CCT CCT CTG AGG 48
Met Leu Pro His Arg Val Pro Arg Cys Val Arg Thr Pro Pro Leu Arg
2035 2040 2045

GGC TCC GCT CGC CCC TCT TCG GAC CCC CCG GGG CCC CGG CTG GCC AGA 96
Gly Ser Ala Arg Pro Ser Ser Asp Pro Pro Gly Pro Arg Leu Ala Arg
2050 2055 2060

GGA TGG ACG AGG AGG AGG ATG GAG CGG GCG CCG AGG AGT CGG GAC AGC 144
Gly Trp Thr Arg Arg Arg Met Glu Arg Ala Pro Arg Ser Arg Asp Ser
2065 2070 2075 2080



CCC GTA GCT TCA CGC AGC TCA ACG ACC TGT CCG GGG CCG GGG GCG GCA 192
Pro Val Ala Ser Arg Ser Ser Thr Thr Cys Pro Gly Pro Gly Ala Ala
2085 2090 2095

GGG GCC GGG TCG ACG GAA AAG GAC CCG GGC AGC GCG GAC TCC GAG GCG 240
Gly Ala Gly Ser Thr Glu Lys Asp Pro Gly Ser Ala Asp Ser Glu Ala
2100 2105 2110

GAG GGG CTG CCG TAC CCG GCG CTA GCC CCG GTG GTT TTC TTC TAC TTG 288
Glu Gly Leu Pro Tyr Pro Ala Leu Ala Pro Val Val Phe Phe Tyr Leu
2115 2120 2125

AGC CAG GAC AGC CGC CCG CGG AGC TGG TGT CTC CGC ACG GTC TGT AAC 336 
Ser Gln Asp Ser Arg Pro Arg Ser Trp Cys Leu Arg Thr Val Cys Asn
2130 2135 2140

CCG TGG TTC GAG CGA GTC AGT ATG CTG GTC ATT CTT CTC AAC TGT GTG 384
Pro Trp Phe Glu Arg Val Ser Met Leu Val Ile Leu Leu Asn Cys Val
2145 2150 2155 2160

ACT CTG GGT ATG TTC AGG CCG TGT GAG GAC ATT GCC TGT GAC TCC CAG 432
Thr Leu Gly Met Phe Arg Pro Cys Glu Asp Ile Ala Cys Asp Ser Gln
2165 2170 2175

CGC TGC CGG ATC CTG CAG GCC TTC GAT GAC TTC ATC TTT GCC TTC TTT 480
Arg Cys Arg Ile Leu Gln Ala Phe Asp Asp Phe Ile Phe Ala Phe Phe
2180 2185 2190



GCT GTG GAA ATG GTG GTG AAG ATG GTG GCC TTG GGC ATC TTT GGG AAG 528 

Ala Val Glu Met Val Val Lys Met Val Ala Leu Gly Ile Phe Gly Lys
2195 2200 2205

AAA TGT TAC CTG GGA GAC ACT TGG AAC CGG CTT GAC TTT TTC ATT GTC 576
Lys Cys Tyr Leu Gly Asp Thr Trp Asn Arg Leu Asp Phe Phe Ile Val
2210 2215 2220

ATT GCA GGG ATG CTG GAG TAT TCG CTG GAC CTG CAG AAC GTC AGC TTC 624
Ile Ala Gly Met Leu Glu Tyr Ser Leu Asp Leu Gln Asn Val Ser Phe
2225 2230 2235 2240

TCC GOV GTC AGG ACA GTC CGT GTG CTG CGA CCG CTC AGG GCC ATT AAC 672 
Ser Ala Val Arg Thr Val Arg Val Leu Arg Pro Leu Arg Ala Ile Asn
2245 2250 2255

CGG GTG CCC AGC ATG CGC ATT CTC GTC ACA TTA CTG CTG GAC ACC TTG 720
Arg Val Pro Ser Met Arg Ile Leu Val Thr Leu Leu Leu Asp Thr Leu
2260 2265 2270

CCT ATG CTG GGC AAC GTC CTG CTG CTC TGT TTC TTC GTC TTT TTC ATC 768
Pro Met Leu Gly Asn Val Leu Leu Leu Cys Phe Phe Val Phe Phe Ile
2275 2280 2285

TTT GGC ATC GTG GGC GTC CAG CTG TGG GCA GGA CTG CTT CGC AAC CGG 816
Phe Gly Ile Val Gly Val Gln Leu Trp Ala Gly Leu Leu Arg Asn Arg
2290 2295 2300



TGC TTC CTC CCC GAG AAC TTC AGC CTC CCC CTG AGC GTG GAC CTG GAG 864
Cys Phe Leu Pro Glu Asn Phe Ser Leu Pro Leu Ser Val Asp Leu Glu
2305 2310 2315 2320

CCT TAT TAC CAG ACA GAG AAT GAG GAC GAG AGC CCC TTC ATC TGC TCT 912
Pro Tyr Tyr Gln Thr Glu Asn Glu Asp Glu Ser Pro Phe Ile Cys Ser
2325 2330 2335

CAG CCT CGG GAG AAT GGC ATG AGA TCC TGC AGG AGT GTG CCC ACA CTG 960
Gln Pro Arg Glu Asn Gly Met Arg Ser Cys Arg Ser Val Pro Thr Leu
2340 2345 2350

CGT GGG GAA GGC GGT GGT GGC CCA CCC TGC AGT CTG GAC TAT GAG ACC 1008
Arg Gly Glu Gly Gly Gly Gly Pro Pro Cys Ser Leu Asp Tyr Glu Thr
2355 2360 2365

TAT AAC AGT TCC AGC AAC ACC ACC TGT GTC AAC TGG AAC CAG TAC TAT 1056
Tyr Asn Ser Ser Ser Asn Thr Thr Cys Val Asn Trp Asn Gln Tyr Tyr
2370 2375 2380



ACC AAC TGC TCT GCG GGC GAG CAC AAC CCC TTC AAA GGC GCC ATC AAC 1104
Thr Asn Cys Ser Ala Gly Glu His Asn Pro Phe Lys Gly Ala Ile Asn
2385 2390 2395 2400

TTT GAC AAC ATT GGC TAT GCC TGG ATC GCC ATC TTC CAG GTC ATC ACA 1152
Phe Asp Asn Ile Gly Tyr Ala Trp Ile Ala Ile Phe Gln Val Ile Thr
2405 2410 2415

CTG GAG GGC TGG GTC GAC ATC ATG TAC TTC GTA ATG GAC GCT CAC TCC 1200
Leu Glu Gly Trp Val Asp Ile Met Tyr Phe Val Met Asp Ala His Ser
2420 2425 2430

TTC TAC AAC TTC ATC TAC TTC ATT CTT CTC ATC ATC GTG GGC TCC TTC 1248
Phe Tyr Asn Phe Ile Tyr Phe Ile Leu Leu Ile Ile Val Gly Ser Phe
2435 2440 2445

TTC ATG ATC AAC CTG TGC CTG GTG GTG ATT GCC ACG CAG TTC TCC GAG 1296
Phe Met Ile Asn Leu Cys Leu Val Val Ile Ala Thr Gln Phe Ser Glu
2450 2455 2460

ACC AAA CAG CGG GAG AGT CAG CTG ATG CGG GAG CAG CGT GTA CGA TTC 1344 
Thr Lys Gln Arg Glu Ser Gln Leu Met Arg Glu Gln Arg Val Arg Phe
2465 2470 2475 2480

CTG TCC AAT GCT AGC ACC CTG GCA AGC TTC TCT GAG CCA GGC AGC TGC 1392
Leu Ser Asn Ala Ser Thr Leu Ala Ser Phe Ser Glu Pro Gly Ser Cys
2485 2490 2495

TAT GAG GAG CTA CTC AAG TAC CTG GTG TAC ATC CTC CGA AAA GCA GCC 1440
Tyr Glu Glu Leu Leu Lys Tyr Leu Val Tyr Ile Leu Arg Lys Ala Ala
2500 2505 2510

CGA AGG CTG GCC CAG GTC TCT AGG GCT ATA GGC GTG CGG GCT GGG CTG 1488
Arg Arg Leu Ala Gln Val Ser Arg Ala Ile Gly Val Arg Ala Gly Leu
2515 2520 2525



CTC AGC AGC CCA GTG GCC CGT AGT GGG CAG GAG CCC CAG CCC AGT GGC 1536
Leu Ser Ser Pro Val Ala Arg Ser Gly Gln Glu Pro Gln Pro Ser Gly
2530 2535 2540

AGC TGC ACT CGC TCA CAC CGT CGI CTG TCT GTC CAC CAC CTG GTC CAC 1584
Ser Cys Thr Arg Ser His Arg Arg Leu Ser Val His His Leu Val His
2545 2550 2555 2560

CAC CAT CAC CAC CAC CAT CAC CAC TAC CAC CTG GGT AAT GGG ACG CTC 1632
His His His His His His His His Tyr His Leu Gly Asn Gly Thr Leu
2565 2570 2575

AGA GTT CCC CGG GCC AGC CCA GAG ATC CAG GAC AGG GAT GCC AAT GGG 1680
Arg Val Pro Arg Ala Ser Pro Glu Ile Gln Asp Arg Asp Ala Asn Gly
2580 2585 2590

TCT CGC CGG CTC ATG CTA CCA CCA CCC TCT ACA CCC ACT CCC TCT GGG 1728
Ser Arg Arg Leu Met Leu Pro Pro Pro Ser Thr Pro Thr Pro Ser Gly
2595 2600 2605

GGC CCT CCG AGG GGT GCG GAG TCT GTA CAC AGC TTC TAC CAT GCT GAC 1776
Gly Pro Pro Arg Gly Ala Glu Ser Val His Ser Phe Tyr His Ala Asp
2610 2615 2620

TGC CAC TTG GAG CCA GTC CGT TGC CAG GCA CCC CCT CCC AGA TGC CCA 1824
Cys His Leu Glu Pro Val Arg Cys Gln Ala Pro Pro Pro Arg Cys Pro
2625 2630 2635 2640

TCG GAG GCA TCT GGT AGG ACT GTG GGT AGT GGG AAG GTG TAC CCC ACT 1872
Ser Glu Ala Ser Gly Arg Thr Val Gly Ser Gly Lys Val Tyr Pro Thr
2645 2650 2655

GTG CAT ACC AGC CCT CCA CCA GAG ATA CTG AAG GAT AAA GCA CTA GTG 1920
Val His Thr Ser Pro Pro Pro Glu Ile Leu Lys Asp Lys Ala Leu Val
2660 2665 2670

GAG GTG GCC CCC AGC CCT GGG CCC CCC ACC CTC ACC AGC TTC AAC ATC 1968
Glu Val Ala Pro Ser Pro Gly Pro Pro Thr Leu Thr Ser Phe Asn Ile
2675 2680 2685

CCA CCT GGG CCC TTC AGC TCC ATG CAC AAG CTC CTG GAG ACA CAG AGT 2016 
Pro Pro Gly Pro Phe Ser Ser Met His Lys Leu Leu Glu Thr Gln Ser
2690 2695 2700

ACG GGA GCC TGC CAT AGC TCC TGC AAA ATC TCC AGC CCT TGC TCC AAG 2064
Thr Gly Ala Cys His Ser Ser Cys Lys Ile Ser Ser Pro Cys Ser Lys
2705 2710 2715 2720

GCA GAC AGT GGA GCC TGC GGG CCG GAC AGT TGT CCC TAC TGT GCC CGG 2112 
Ala Asp Ser Gly Ala Cys Gly Pro Asp Ser Cys Pro Tyr Cys Ala Arg 
2725 2730 2735 



ACA GGA GCA GGA GAG CCA GAG TCC GCT GAC CAT GTC ATG CCT GAC TCA 2160
Thr Gly Ala Gly Glu Pro Glu Ser Ala Asp His Val Met Pro Asp Ser
2740 2745 2750

GAC AGC GAG GCT GTG TAT GAG TTC ACA CAG GAC GCT CAG CAC AGT GAC 2208
Asp Ser Glu Ala Val Tyr Glu Phe Thr Gln Asp Ala Gln His Ser Asp
2755 2760 2765

CTC CGG GAT CCC CAC AGC CGG CGG CGA CAG CGG AGC CTG GGC CCA GAT 2256
Leu Arg Asp Pro His Ser Arg Arg Arg Gln Arg Ser Leu Gly Pro Asp
2770 2775 2780

GCA GAG CCT AGT TCT GTG CTG GCT TTC TGG AGG CTG ATC TGT GAC ACA 2304
Ala Glu Pro Ser Ser Val Leu Ala Phe Trp Arg Leu Ile Cys Asp Thr
2785 2790 2795 2800

TTC CGG AAG ATC GTA GAT AGC AAA TAC TTT GGC CGG GGA ATC ATG ATC 2352
Phe Arg Lys Ile Val Asp Ser Lys Tyr Phe Gly Arg Gly Ile Met Ile
2805 2810 2815



GCC ATC CTG GTC AAT ACA CTC AGC ATG GGC ATC GAG TAC CAC GAG CAG 2400 
Ala Ile Leu Val Asn Thr Leu Ser Met Gly Ile Glu Tyr His Glu Gln
2820 2825 2830

CCC GAG GAG CTC ACC AAC GCC CTG GAA ATC AGC AAC ATC GTC TTC ACC 2448
Pro Glu Glu Leu Thr Asn Ala Leu Glu Ile Ser Asn Ile Val Phe Thr
2835 2840 2845

AGC CTC TTC GCC TTG GAG ATG CTG CTG AAA CTG CTT GTC TAC GGT CCC 2496
Ser Leu Phe Ala Leu Glu Met Leu Leu Lys Leu Leu Val Tyr Gly Pro
2850 2855 2860

TTT GGC TAC ATT AAG AAT CCC TAC AAC ATC TTT GAT GGT GTC ATT GTG 2544
Phe Gly Tyr Ile Lys Asn Pro Tyr Asn Ile Phe Asp Gly Val Ile Val
2865 2870 2875 2880

GTC ATC AGT GTG TGG GAG ATT GTG GGC CAG CAG GGA GGT GGC CTG TCG 2592
Val Ile Ser Val Trp Glu Ile Val Gly Gln Gln Gly Gly Gly Leu Ser
2885 2890 2895

GTG CTG CGG ACC TTC CGC CTG ATG CGG GTG CTG AAG CTG GTG CGC TTC 2640
Val Leu Arg Thr Phe Arg Leu Met Arg Val Leu Lys Leu Val Arg Phe
2900 2905 2910

CTG CCG GCC CTG CAG CGC CAG CTC GTG GTG CTC ATG AAG ACC ATG GAC 2688
Leu Pro Ala Leu Gln Arg Gln Leu Val Val Leu Met Lys Thr Met Asp
2915 2920 2925

AAC GTG GCC ACC TTC TGC ATG CTC CTC ATG CTG TTC ATC TTC ATC TTC 2736
Asn Val Ala Thr Phe Cys Met Leu Leu Met Leu Phe Ile Phe Ile Phe
2930 2935 2940

AGC ATC CTG GGC ATG CAT CTC TTT GGT TGC AAG TTC GCA TCT GAA CGG 2784
Ser Ile Leu Gly Met His Leu Phe Gly Cys Lys Phe Ala Ser Glu Arg
2945 2950 2955 2960

GAT GGG GAC ACS TTG CCA GAC CGG AAG AAT TTC GAC TCC CTG CTC TGG 2832
Asp Gly Asp Thr Leu Pro Asp Arg Lys Asn Phe Asp Ser Leu Leu Trp
2965 2970 2975



GCC ATC GTC ACT GTC TTT CAG ATT CTG ACT CAG GAA GAC TGG AAT AAA 2880
Ala Ile Val Thr Val Phe Gln Ile Leu Thr Gln Glu Asp Trp Asn Lys
2980 2985 2990

GTC CTC TAC AAC GGC ATG GCC TCC ACA TCG TCT TGG GCT GCT CTT TAC 2928
Val Leu Tyr Asn Gly Met Ala Ser Thr Ser Ser Trp Ala Ala Leu Tyr
2995 3000 3005

TTC ATC GCC CTC ATG ACT TTT GGC AAC TAT GTG CTC TTT AAC CTG CTG 2976
Phe Ile Ala Leu Met Thr Phe Gly Asn Tyr Val Leu Phe Asn Leu Leu
3010 3015 3020

GTG GCC ATT CTT GTG GAA GGA TTC CAG GCA GAG GGA GAT GCC ACC AAG 3024
Val Ala Ile Leu Val Glu Gly Phe Gln Ala Glu Gly Asp Ala Thr Lys
3025 3030 3035 3040

TCT GAG TCA GAG CCT GAT TTC TTT TCG CCC AGT GTG GAT GGT GAT GGG 3072
Ser Glu Ser Glu Pro Asp Phe Phe Ser Pro Ser Val Asp Gly Asp Gly
3045 3050 3055



GAC AGA AAG AAG CGC TTG GCC CTG GTG GCT TTG GGA GAA CAC GCG GAA 3120
Asp Arg Lys Lys Arg Leu Ala Leu Val Ala Leu Gly Glu His Ala Glu
3060 3065 3070

CTA CGA AAG AGC CTT TTG CCA CCC CTC ATC ATC CAT ACG GCT GCG ACA 3168
Leu Arg Lys Ser Leu Leu Pro Pro Leu Ile Ile His Thr Ala Ala Thr
3075 3080 3085

CCA ATG TCA CAC CCC AAG AGC TCC AGC ACA GGT GTG GGG GAA GCA CTG 3216
Pro Met Ser His Pro Lys Ser Ser Ser Thr Gly Val Gly Glu Ala Leu
3090 3095 3100

GGC TCT GGC TCT CGA CGT ACC AGT AGC AGT GGG TCC GCT GAG CCT GGA 3264
Gly Ser Gly Ser Arg Arg Thr Ser Ser Ser Gly Ser Ala Glu Pro Gly
3105 3110 3115 3120



GCT GCC CAC CAT GAG ATG AAA TGT CCG CCA AGT GCC CGC AGC TCC CCG 3312
Ala Ala His His Glu Met Lys Cys Pro Pro Ser Ala Arg Ser Ser Pro
3125 3130 3135

CAC AGT CCC TGG AGT GCG GCA AGC AGC TGG ACC AGC AGG CGC TCC AGC 3360
His Ser Pro Trp Ser Ala Ala Ser Ser Trp Thr Ser Arg Arg Ser Ser
3140 3145 3150

AGG AAC AGC CTG GGC CGG GCC CCC AGC CTA AAG CGG AGG AGC CCG AGC 3408
Arg Asn Ser Leu Gly Arg Ala Pro Ser Leu Lys Arg Arg Ser Pro Ser
3155 3160 3165

GGG GAG CGG AGG TCC CTG CTG TCT GGA GAG GGC CAG GAG AGT CAG GAT 3456
Gly Glu Arg Arg Ser Leu Leu Ser Gly Glu Gly Gln Glu Ser Gln Asp
3170 3175 3180

GAG GAG GAA AGT TCA GAA GAG GAC CGG GCC AGC CCA GCA GGC AGT GAC 3504
Glu Glu Glu Ser Ser Glu Glu Asp Arg Ala Ser Pro Ala Gly Ser Asp
3185 3190 3195 3200

CAT CGC CAC AGG GGT TCC TTG GAA CGT GAG GCC AAG AGT TCC TTT GAC 3552 
His Arg His Arg Gly Ser Leu Glu Arg Glu Ala Lys Ser Ser Phe Asp 



3205 3210 3215

CTG CCT GAC ACT CTG CAG GTG CCG GGG CTG CAC CGC ACA GCC AGC GGC 3600
Leu Pro Asp Thr Leu Gln Val Pro Gly Leu His Arg Thr Ala Ser Gly
3220 3225 3230

CGG AGC TCT GCC TCT GAG CAC CAA GAC TGT AAT GGC AAG TCG GCT TCA 3648
Arg Ser Ser Ala Ser Glu His Gln Asp Cys Asn Gly Lys Ser Ala Ser
3235 3240 3245

GGG CGT TTG GCC CGC ACC CTG AGG ACT GAT GAC CCC CAA CTG GAT GGG 3696
Gly Arg Leu Ala Arg Thr Leu Arg Thr Asp Asp Pro Gln Leu Asp Gly
3250 3255 3260



GAT GAT GAC AAT GAT GAG GGA AAT CTG AGC AAA GGG GAA CGC ATA CAA 3744
Asp Asp Asp Asn Asp Glu Gly Asn Leu Ser Lys Gly Glu Arg Ile Gln
3265 3270 3275 3280

GCC TGG GTC AGA TCC CGG CTT CCT GCC TGT TGC CGA GAG CGA GAT TCC 3792
Ala Trp Val Arg Ser Arg Leu Pro Ala Cys Cys Arg Glu Arg Asp Ser
3285 3290 3295

TGG TCG GCC TAT ATC TTT CCT CCT CAG TCA AGG TTT CGT CTC CTG TGT 3840
Trp Ser Ala Tyr Ile Phe Pro Pro Gln Ser Arg Phe Arg Leu Leu Cys
3300 3305 3310



o 

£15 



CAC CGG ATC ATC ACC CAC AAG ATG TTT GAC CAT GTG GTC CTC GTC ATC 3888 
His Arg Ile Ile Thr His Lys Met Phe Asp His Val Val Leu Val Ile
3315 3320 3325

ATC TTC CTC AAC TGT ATC ACC ATC GCT ATG GAG CGC CCC AAA ATT GAC 3936
Ile Phe Leu Asn Cys Ile Thr Ile Ala Met Glu Arg Pro Lys Ile Asp
3330 3335 3340

CCC CAC AGC GCT GAG CGC ATC TTC CTG ACC CTC TCC AAC TAC ATC TTC 3984
Pro His Ser Ala Glu Arg Ile Phe Leu Thr Leu Ser Asn Tyr Ile Phe
3345 3350 3355 3360

ACG GCA GTC TTT CTA GCT GAA ATG ACA GTG AAG GTG GTG GCA CTG GGC 4032
Thr Ala Val Phe Leu Ala Glu Met Thr Val Lys Val Val Ala Leu Gly
3365 3370 3375

TGG TGC TTT GGG GAG CAG GCC TAC CTG CGC AGC AGC TGG AAT GTG CTG 4080
Trp Cys Phe Gly Glu Gln Ala Tyr Leu Arg Ser Ser Trp Asn Val Leu
3380 3385 3390



GAC GGC TTG CTG GTG CTC ATC TCC GTC ATC GAC ATC CTG GTC TCC ATG 4128 

Asp Gly Leu Leu Val Leu Ile Ser Val Ile Asp Ile Leu Val Ser Met
3395 3400 3405

GTC TCC GAC AGC GGC ACC AAG ATC CTT GGC ATG CTG AGG GTG CTG CGG 4176
Val Ser Asp Ser Gly Thr Lys Ile Leu Gly Met Leu Arg Val Leu Arg
3410 3415 3420

CTG CTG CGG ACC CTG CGC CCA CTC AGG GTC ATC AGC CGG GCC CAG GGA 4224
Leu Leu Arg Thr Leu Arg Pro Leu Arg Val Ile Ser Arg Ala Gln Gly
3425 3430 3435 3440

CTG AAG CTG GTG GTA GAG ACT CTG ATG TCA TCC CTC AAA CCC ATT GGC 4272
Leu Lys Leu Val Val Glu Thr Leu Met Ser Ser Leu Lys Pro Ile Gly
3445 3450 3455

AAC ATT GTG GTC ATT TGC TGT GCC TTC TTC ATC ATT TTT GGA ATT CTC 4320
Asn Ile Val Val Ile Cys Cys Ala Phe Phe Ile Ile Phe Gly Ile Leu
3460 3465 3470

GGG GTG CAG CTC TTC AAA GGG AAG TTC TTC GTG TGT CAG GGT GAG GAC 4368



Gly Val Gln Leu Phe Lys Gly Lys Phe Phe Val Cys Gln Gly Glu Asp
3475 3480 3485

ACC AGG AAC ATC ACT AAC AAA TCC GAC TGC GCT GAG GCC AGC TAC CGA 4416
Thr Arg Asn Ile Thr Asn Lys Ser Asp Cys Ala Glu Ala Ser Tyr Arg
3490 3495 3500

TGG GTC CGG CAC AAG TAC AAC TTT GAC AAC CTG GGC CAG GCT CTG ATG 4464
Trp Val Arg His Lys Tyr Asn Phe Asp Asn Leu Gly Gln Ala Leu Met
3505 3510 3515 3520

TCC CTG TTT GTG CTG GCC TCC AAG GAT GGT TGG GTT GAC ATC ATG TAT 4512
Ser Leu Phe Val Leu Ala Ser Lys Asp Gly Trp Val Asp Ile Met Tyr
3525 3530 3535

GAT GGG CTG GAT GCT GTG GGT GTG GAT CAG CAG CCC ATC ATG AAC CAC 4560
Asp Gly Leu Asp Ala Val Gly Val Asp Gln Gln Pro Ile Met Asn His
3540 3545 3550

AAC CCC TGG ATG CTG CTA TAC TTC ATC TCC TTC CTC CTC ATC GTG GCC 4608
Asn Pro Trp Met Leu Leu Tyr Phe Ile Ser Phe Leu Leu Ile Val Ala
3555 3560 3565



TTC TTT GTC CTG AAC ATG TTT GTG GGC GTG GTG GTG GAG AAC TTC CAT 4656
Phe Phe Val Leu Asn Met Phe Val Gly Val Val Val Glu Asn Phe His
3570 3575 3580

AAG TGC AGA CAG CAC CAG GAG GAG GAG GAG GCG AGG CGG CGT GAG GAG 4704
Lys Cys Arg Gln His Gln Glu Glu Glu Glu Ala Arg Arg Arg Glu Glu
3585 3590 3595 3600

AAG CGA CTA CGG AGG CTG GAG AAA AAG AGA AGG AGT AAG GAG AAG CAG 4752
Lys Arg Leu Arg Arg Leu Glu Lys Lys Arg Arg Ser Lys Glu Lys Gln
3605 3610 3615

ATG GCC GAA GCC CAG TGC AAG CCC TAC TAC TCT GAC TAC TCG AGA TTC 4800
Met Ala Glu Ala Gln Cys Lys Pro Tyr Tyr Ser Asp Tyr Ser Arg Phe
3620 3625 3630



CGG CTC CTT GTC CAC CAC CTG TGT ACC AGC CAC TAC CTG GAC CTC TTC 4848
Arg Leu Leu Val His His Leu Cys Thr Ser His Tyr Leu Asp Leu Phe
3635 3640 3645

ATC ACT GGT GTC ATC GGG CTG AAC GTG GTC ACT ATG GCC ATG GAA CAT 4896
Ile Thr Gly Val Ile Gly Leu Asn Val Val Thr Met Ala Met Glu His
3650 3655 3660

TAC CAG CAG CCC CAG ATC CTG GAC GAG GCT CTG AAG ATC TGC AAT TAC 4944
Tyr Gln Gln Pro Gln Ile Leu Asp Glu Ala Leu Lys Ile Cys Asn Tyr
3665 3670 3675 3680

ATC TTT ACC GTC ATC TTT GTC TTT GAG TCA GTT TTC AAA CTT GTG GCC 4992
Ile Phe Thr Val Ile Phe Val Phe Glu Ser Val Phe Lys Leu Val Ala
3685 3690 3695

TTT GCG TTC CGC CGT TTC TTC CAG GAC AGG TGG AAC CAG CTG GAC CTG 5040
Phe Ala Phe Arg Arg Phe Phe Gln Asp Arg Trp Asn Gln Leu Asp Leu
3700 3705 3710

GCT ATT GTG CTT CTG TCC ATC ATG GGC ATC ACA CTG GAG GAG ATT GAG 5088 
Ala Ile Val Leu Leu Ser Ile Met Gly Ile Thr Leu Glu Glu Ile Glu
3715 3720 3725

GTC AAT CTG TCG CTG CCC ATC AAC CCC ACC ATC ATC CGT ATC ATG AGG 5136
Val Asn Leu Ser Leu Pro Ile Asn Pro Thr Ile Ile Arg Ile Met Arg
3730 3735 3740

GTG CTC CGC ATT GCT CGA GTT CTG AAG CTG TTG AAG ATG GCT GTG GGC 5184
Val Leu Arg Ile Ala Arg Val Leu Lys Leu Leu Lys Met Ala Val Gly

3745 3750 3755 3760 

ATG CGG GCA CTG CTG CAC ACG GTG ATG CAG GCC CTG CCC CAG GTG GGG 5232
Met Arg Ala Leu Leu His Thr Val Met Gln Ala Leu Pro Gln Val Gly
3765 3770 3775

AAC CTG GGA CTT CTC TTC ATG TTA TTG TTT TTC ATC TTT GCA GCT CTG 5280
Asn Leu Gly Leu Leu Phe Met Leu Leu Phe Phe Ile Phe Ala Ala Leu
3780 3785 3790

GGC GTG GAG CTC TTT GGA GAC CTG GAG TGT GAT GAG ACA CAC CCT TGT 5328
Gly Val Glu Leu Phe Gly Asp Leu Glu Cys Asp Glu Thr His Pro Cys
3795 3800 3805

GAG GGC TTG GGT CGG CAT GCC ACC TTT AGG AAC TTT GGT ATG GCC TTT 5376
Glu Gly Leu Gly Arg His Ala Thr Phe Arg Asn Phe Gly Met Ala Phe
3810 3815 3820

CTG ACC CTC TTC CGA GTC TCC ACT GGT GAC AAC TGG AAT GGT ATT ATG 5424
Leu Thr Leu Phe Arg Val Ser Thr Gly Asp Asn Trp Asn Gly Ile Met
3825 3830 3835 3840

AAG GAC CCT TCC CGG GAC TGT GAC CAG GAG TCC ACC TGC TAC AAC ACT 5472
Lys Asp Pro Ser Arg Asp Cys Asp Gln Glu Ser Thr Cys Tyr Asn Thr
3845 3850 3855



GTC ATC TCC CCT ATC TAC TTT GTG TCC TTC GTG CTG ACG GCC CAG TTT 5520
Val Ile Ser Pro Ile Tyr Phe Val Ser Phe Val Leu Thr Ala Gln Phe
3860 3865 3870

GTG CTG GTC AAC GTG GTC ATA GCT GTG CTG ATG AAG CAC CTG GAA GAA 5568
Val Leu Val Asn Val Val Ile Ala Val Leu Met Lys His Leu Glu Glu
3875 3880 3885

AGC AAC AAA GAG GCC AAG GAG GAG GCC GAG CTC GAG GCC GAG CTG GAG 5616
Ser Asn Lys Glu Ala Lys Glu Glu Ala Glu Leu Glu Ala Glu Leu Glu
3890 3895 3900

CTG GAG ATG AAG ACG CTC AGC CCG CAG CCC CAC TCC CCG CTG GGC AGC 5664 
Leu Glu Met Lys Thr Leu Ser Pro Gln Pro His Ser Pro Leu Gly Ser
3905 3910 3915 3920

CCC TTC CTC TGG CCC GGG GTG GAG GGT GTC AAC AGT ACT GAC AGC CCT 5712
Pro Phe Leu Trp Pro Gly Val Glu Gly Val Asn Ser Thr Asp Ser Pro
3925 3930 3935

AAG CCT GGG GCT CCA CAC ACC ACT GCC CAC ATT GGA GCA GCC TCG GGC 5760
Lys Pro Gly Ala Pro His Thr Thr Ala His Ile Gly Ala Ala Ser Gly
3940 3945 3950

TTC TCC CTT GAG CAC CCC ACG ATG GTA CCC CAC CCC GAG GAG GTG CCA 5808
Phe Ser Leu Glu His Pro Thr Met Val Pro His Pro Glu Glu Val Pro
3955 3960 3965

GTC CCC CTA GGA CCA GAC CTG CTG ACT GTG AGG AAG TCT GGT GTC AGC 5856
Val Pro Leu Gly Pro Asp Leu Leu Thr Val Arg Lys Ser Gly Val Ser
3970 3975 3980

CGG ACG CAC TCT CTG CCC AAT GAC AGC TAC ATG TGC CGC AAT GGG AGC 5904
Arg Thr His Ser Leu Pro Asn Asp Ser Tyr Met Cys Arg Asn Gly Ser
3985 3990 3995 4000



ACT GCT GAG AGA TCC CTA GGA CAC AGG GGC TGG GGG CTC CCC AAA GCC 5952 
Thr Ala Glu Arg Ser Leu Gly His Arg Gly Trp Gly Leu Pro Lys Ala 



4005 4010 4015

CAG TCA GGC TCC ATC TTG TCC GTT CAC TCC CAA CCA GCA GAC ACC AGC 6000
Gln Ser Gly Ser Ile Leu Ser Val His Ser Gln Pro Ala Asp Thr Ser
4020 4025 4030

TGC ATC CTA CAG CTT CCC AAA GAT GTG CAC TAT CTG CTC CAG CCT CAT 6048
Cys Ile Leu Gln Leu Pro Lys Asp Val His Tyr Leu Leu Gln Pro His
4035 4040 4045

GGG GCT CCC ACC TGG GGC GCC ATC CCT AAA CTA CCC CCA CCT GGC CGC 6096
Gly Ala Pro Thr Trp Gly Ala Ile Pro Lys Leu Pro Pro Pro Gly Arg
4050 4055 4060

TCC CCT CTG GCT CAG AGG CCT CTC AGG CGC CAG GCA GCA ATA AGG ACT 6144
Ser Pro Leu Ala Gln Arg Pro Leu Arg Arg Gln Ala Ala Ile Arg Thr
4065 4070 4075 4080

GAC TCC CTG GAT GTG CAG GGC CTG GGT AGC CGG GAA GAC CTG TTG TCA 6192
Asp Ser Leu Asp Val Gln Gly Leu Gly Ser Arg Glu Asp Leu Leu Ser
4085 4090 4095

GAG GTG AGT GGG CCC TCC TGC CCT CTG ACC CGG TCC TCA TCC TTC TGG 6240
Glu Val Ser Gly Pro Ser Cys Pro Leu Thr Arg Ser Ser Ser Phe Trp
4100 4105 4110



GGC GGG TCG AGC ATC CAG GTG CAG CAG CGT TCC GGC ATC CAG AGC AAA 6288
Gly Gly Ser Ser Ile Gln Val Gln Gln Arg Ser Gly Ile Gln Ser Lys
4115 4120 4125

GTC TCC AAG CAC ATC CGC CTG CCA GCC CCT TGC CCA GGC CTG GAA CCC 6336
Val Ser Lys His Ile Arg Leu Pro Ala Pro Cys Pro Gly Leu Glu Pro
4130 4135 4140

AGC TGG GCC AAG GAC CCT CCA GAG ACC AGA AGC AGC TTA GAG CTG GAC 6384
Ser Trp Ala Lys Asp Pro Pro Glu Thr Arg Ser Ser Leu Glu Leu Asp
4145 4150 4155 4160





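ACG GAG CTG AGC TGG ATT TCA GGA GAC CTC CTT CCC AGC AGC CAG GAA 6432
Thr Glu Leu Ser Trp Ile Ser Gly Asp Leu Leu Pro Ser Ser Gln Glu
4165 4170 4175

GAA CCC CTG TTC CCA CGG GAC CTG AAG AAG TGC TAC AGT GTA GAG ACC 6480
Glu Pro Leu Phe Pro Arg Asp Leu Lys Lys Cys Tyr Ser Val Glu Thr
4180 4185 4190

CAG AGC TGC AGG CGC AGG CCT GGG TTC TGG CTA GAT GAA CAG CGG AGA 6528
Gln Ser Cys Arg Arg Arg Pro Gly Phe Trp Leu Asp Glu Gln Arg Arg
4195 4200 4205

CAC TCC ATT GCT GTC AGC TGT CTG GAC AGC GGC TCC CAA CCC CGC CTA 6576
His Ser Ile Ala Val Ser Cys Leu Asp Ser Gly Ser Gln Pro Arg Leu
4210 4215 4220

TGT CCA AGC CCC TCA AGC CTC GGG GGC CAA CCT CTT GGG GGT CCT GGG 6624
Cys Pro Ser Pro Ser Ser Leu Gly Gly Gln Pro Leu Gly Gly Pro Gly
4225 4230 4235 4240

AGC CGG CCT AAG AAA AAA CTC AGC CCA CCC AGT ATC TCT ATA GAC CCC 6672
Ser Arg Pro Lys Lys Lys Leu Ser Pro Pro Ser Ile Ser Ile Asp Pro
4245 4250 4255

CCG GAG AGC CAG GGC TCT CGG CCC CCA TGC AGT CCT GGT GTC TGC CTC 6720
Pro Glu Ser Gln Gly Ser Arg Pro Pro Cys Ser Pro Gly Val Cys Leu
4260 4265 4270

AGG AGG AGG GCG CCG GCC AGT GAC TCT AAG GAT CCC TCG GTC TCC AGC 6768
Arg Arg Arg Ala Pro Ala Ser Asp Ser Lys Asp Pro Ser Val Ser Ser
4275 4280 4285

CCC CTT GAC AGC ACG GCT GCC TCA CCC TCC CCA AAG AAA GAC ACG CTG 6816
Pro Leu Asp Ser Thr Ala Ala Ser Pro Ser Pro Lys Lys Asp Thr Leu
4290 4295 4300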



AGT CTC TCT GGT TTG TCT TCT GAC CCA ACA GAC ATG GAC CCC 6858 
Ser Leu Ser Gly Leu Ser Ser Asp Pro Thr Asp Met Asp Pro 
4305 4310 4315 

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7540 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CCGTCTCTGG CGCGGAGCGG GACGATGCTG ACCCCTTAGA TCCTGCTCCA GCTGCGCCGA 60

GGGAAGAGGG GGCGCCCCTC CCCGGACCCC CGCCCTCCAT CGGGTGGCCC CTTTTTTTTC 120

TCTTCCTCTC GGGGGCTGCT TCGCCGAAGG TAGCGCCTGT TACGGGCAAC CGGAGCCTGG 180

GCGCGAACGA AGAAGCCGGA ACAAAGTGAG GGGAAGCCGC CCGGCTAGTC GGGGAGCCCC 240

CGGGAACCCA GGGGAAGCGG GACTCTACGC CAGGCGGGGC TTCCCTGAGA CCCGGCGCCC 300 

CGCGGGCAGC ATGCCCTGAG GGCAGGGGGA GCTGAGCTGA ACTGGCCCTC CTGGGGACTC 360 

AGCAAGCTCT CTAGAGCCCC CCACATGCTC CCCCACCGGG TCCCCCGTTG CGTGAGGACA 420

CCTCCTCTGA GGGGCTCCGC TCGCCCCTCT TCGGACCCCC CGGGGCCCCG GCTGGCCAGA 480

GGATGGACGA GGAGGAGGAT GGAGCGGGCG CCGAGGAGTC GGGACAGCCC CGTAGCTTCA 540

CGCAGCTCAA CGACCTGTCC GGGGCCGGGG GCGGCAGGGG CCGGGTCGAC GGAAAAGGAC 600

CCGGGCAGCG CGGACTCCGA GGCGGAGGGG CTGCCGTACC CGGCGCTAGC CCCGGTGGTT 660

TTCTTCTACT TGAGCCAGGA CAGCCGCCCG CGGAGCTGGT GTCTCCGCAC GGTCTGTAAC 720

CCGTGGTTCG AGCGAGTCAG TATGCTGGTC ATTCTTCTCA ACTGTGTGAC TCTGGGTATG 780

TTCAGGCCGT GTGAGGACAT TGCCTGTGAC TCCCAGCGCT GCCGGATCCT GCAGGCCTTC 840

GATGACTTCA TCTTTGCCTT CTTTGCTGTG GAAATGGTGG TGAAGATGGT GGCCTTGGGC 900

ATCTTTGGGA AGAAATGTTA CCTGGGAGAC ACTTGGAACC GGCTTGACTT TTTCATTGTC 960 

ATTGCAGGGA TGCTGGAGTA TTCGCTGGAC CTGCAGAACG TCAGCTTCTC CGCAGTCAGG 1020 

ACAGTCCGTG TGCTGCGACC GCTCAGGGCC ATTAACCGGG TGCCCAGCAT GCGCATTCTC 1080 



GTCACATTAC TGCTGGACAC CTTGCCTATG CTGGGCAACG TCCTGCTGCT CTGTTTCTTC 1140

GTCTTTTTCA TCTTTGGCAT CGTGGGCGTC CAGCTGTGGG CAGGACTGCT TCGCAACCGG 1200

TGCTTCCTCC CCGAGAACTT CAGCCTCCCC CTGAGCGTGG ACCTGGAGCC TTATTACCAG 1260

ACAGAGAATG AGGACGAGAG CCCCTTCATC TGCTCTCAGC CTCGGGAGAA TGGCATGAGA 1320

TCCTGCAGGA GTGTGCCCAC ACTGCGTGGG GAAGGCGGTG GTGGCCCACC CTGCAGTCTG 1380

GACTATGAGA CCTATAACAG TTCCAGCAAC ACCACCTGTG TCAACTGGAA CCAGTACTAT 1440

ACCAACTGCT CTGCGGGCGA GCACAACCCC TTCAAAGGCG CCATCAACTT TGACAACATT 1500

GGCTATGCCT GGATCGCCAT CTTCCAGGTC ATCACACTGG AGGGCTGGGT CGACATCATG 1560 

TACTTCGTAA TGGACGCTCA CTCCTTCTAC AACTTCATCT ACTTCATTCT TCTCATCATC 1620 

GTGGGCTCCT TCTTCATGAT CAACCTGTGC CTGGTGGTGA TTGCCACGCA GTTCTCCGAG 1680 



ACCAAACAGC GGGAGAGTCA GCTGATGCGG GAGCAGCGTG TACGATTCCT GTCCAATGCT 1740

AGCACCCTGG CAAGCTTCTC TGAGCCAGGC AGCTGCTATG AGGAGCTACT CAAGTACCTG 1800

GTGTACATCC TCCGAAAAGC AGCCCGAAGG CTGGCCCAGG TCTCTAGGGC TATAGGCGTG 1860

CGGGCTGGGC TGCTCAGCAG CCCAGTGGCC CGTAGTGGGC AGGAGCCCCA GCCCAGTGGC 1920

AGCTGCACTC GCTCACACCG TCGTCTGTCT GTCCACCACC TGGTCCACCA CCATCACCAC 1980

CACCATCACC ACTACCACCT GGGTAATGGG ACGCTCAGAG TTCCCCGGGC CAGCCCAGAG 2040

ATCCAGGACA GGGATGCCAA TGGGTCTCGC CGGCTCATGC TACCACCACC CTCTACACCC 2100 

ACTCCCTCTG GGGGCCCTCC GAGGGGTGCG GAGTCTGTAC ACAGCTTCTA CCATGCTGAC 2160

TGCCACTTGG AGCCAGTCCG TTGCCAGGCA CCCCCTCCCA GATGCCCATC GGAGGCATCT 2220

GGTAGGACTG TGGGTAGTGG GAAGGTGTAC CCCACTGTGC ATACCAGCCC TCCACCAGAG 2280
ATACTGAAGG ATAAAGCACT AGTGGAGGTG GCCCCCAGCC CTGGGCCCCC CACCCTCACC 2340 



AGCTTCAACA TCCCACCTGG GCCCTTCAGC TCCATGCACA AGCTCCTGGA GACACAGAGT 2400 



ACGGGAGCCT GCCATAGCTC CTGCAAAATC TCCAGCCCTT GCTCCAAGGC AGACAGTGGA 2460



GCCTGCGGGC CGGACAGTTG TCCCTACTGT GCCCGGACAG GAGCAGGAGA GCCAGAGTCC 2520 



GCCGACCATG TCATGCCTGA CTCAGACAGC GAGGCTGTGT ATGAGTTCAC ACAGGACGCT 2580



CAGCACAGTG ACCTCCGGGA TCCCCACAGC CGGCGGCGAC AGCGGAGCCT GGGCCCAGAT 2640 

GCAGAGCCTA GTTCTGTGCT GGCTTTCTGG AGGCTGATCT GTGACACATT CCGGAAGATC 2700 

GTAGATAGCA AATACTTTGG CCGGGGAATC ATGATCGCCA TCCTGGTCAA TACACTCAGC 2760

ATGGGCATCG AGTACCACGA GCAGCCCGAG GAGCTCACCA ACGCCCTGGA AATCAGCAAC 2820

ATCGTCTTCA CCAGCCTCTT CGCCTTGGAG ATGCTGCTGA AACTGCTTGT CTACGGTCCC 2880

TTTGGCTACA TTAAGAATCC CTACAACATC TTTGATGGTG TCATTGTGGT CATCAGTGTG 2940

TGGGAGATTG TGGGCCAGCA GGGAGGTGGC CTGTCGGTGC TGCGGACCTT CCGCCTGATG 3000

CGGGTGCTGA AGCTGGTGCG CTTCCTGCCG GCCCTGCAGC GCCAGCTCGT GGTGCTCATG 3060

AAGACCATGG ACAACGTGGC CACCTTCTGC ATGCTCCTCA TGCTGTTCAT CTTCATCTTC 3120



AGCATCCTGG GCATGCATCT CTTTGGTTGC AAGTTCGCAT CTGAACGGGA TGGGGACACG 3180 



TTGCCAGACC GGAAGAATTT CGACTCCCTG CTCTGGGCCA TCGTCACTGT CTTTCAGATT 3240

CTGACTCAGG AAGACTGGAA TAAAGTCCTC TACAACGGCA TGGCCTCCAC ATCGTCTTGG 3300

GCTGCTCTTT ACTTCATCGC CCTCATGACT TTTGGCAACT ATGTGCTCTT TAACCTGCTG 3360

GTGGCCATTC TTGTGGAAGG ATTCCAGGCA GAGGGAGATG CCACCAAGTC TGAGTCAGAG 3420

CCTGATTTCT TTTCGCCCAG TGTGGATGGT GATGGGGACA GAAAGAAGCG CTTGGCCCTG 3480



GTGGCTTTGG GAGAACACGC GGAACTACGA AAGAGCCTTT TGCCACCCCT CATCATCCAT 3540 



ACGGCTGCGA CACCAATGTC ACACCCCAAG AGCTCCAGCA CAGGTGTGGG GGAAGCACTG 3600 



GGCTCTGGCT CTCGACGTAC CAGTAGCAGT GGGTCCGCTG AGCCTGGAGC TGCCCACCAT 3660 
GAGATGAAAT GTCCGCCAAG TGCCCGCAGC TCCCCGCACA GTCCCTGGAG TGCGGCAAGC 3720

AGCTGGACCA GCAGGCGCTC CAGCAGGAAC AGCCTGGGCC GGGCCCCCAG CCTAAAGCGG 3780 



AGGAGCCCGA GCGGGGAGCG GAGGTCCCTG CTGTCTGGAG AGGGCCAGGA GAGTCAGGAT 3840 



GAGGAGGAAA GTTCAGAAGA GGACCGGGCC AGCCCAGCAG GCAGTGACCA TCGCCACAGG 3900 




GGTTCCTTGG AACGTGAGGC CAAGAGTTCC TTTGACCTGC CTGACACTCT GCAGGTGCCG 3960 

GGGCTGCACC GCACAGCCAG CGGCCGGAGC TCTGCCTCTG AGCACCAAGA CTGTAATGGC 4020 

AAGTCGGCTT CAGGGCGTTT GGCCCGCACC CTGAGGACTG ATGACCCCCA ACTGGATGGG 4080

GATGATGACA ATGATGAGGG AAATCTGAGC AAAGGGGAAC GCATACAAGC CTGGGTCAGA 4140 

TCCCGGCTTC CTGCCTGTTG CCGAGAGCGA GATTCCTGGT CGGCCTATAT CTTTCCTCCT 4200 

CAGTCAAGGT TTCGTCTCCT GTGTCACCGG ATCATCACCC ACAAGATGTT TGACCATGTG 4260 

GTCCTCGTCA TCATCTTCCT CAACTGTATC ACCATCGCTA TGGAGCGCCC CAAAATTGAC 4320

CCCCACAGCG CTGAGCGCAT CTTCCTGACC CTCTCCAACT ACATCTTCAC GGCAGTCTTT 4380

CTAGCTGAAA TGACAGTGAA GGTGGTGGCA CTGGGCTGGT GCTTTGGGGA GCAGGCCTAC 4440



CTGCGCAGCA GCTGGAATGT GCTGGACGGC TTGCTGGTGC TCATCTCCGT CATCGACATC 4500 

CTGGTCTCCA TGGTCTCCGA CAGCGGCACC AAGATCCTTG GCATGCTGAG GGTGCTGCGG 4560 

CTGCTGCGGA CCCTGCGTCC ACTCAGGGTC ATCAGCCGGG CCCAGGGACT GAAGCTGGTG 4620 

GTAGAGACTC TGATGTCATC CCTCAAACCC ATTGGCAACA TTGTGGTCAT TTGCTGTGCC 4680

TTCTTCATCA TTTTTGGAAT TCTCGGGGTG CAGCTCTTCA AAGGGAAGTT CTTCGTGTGT 4740

CAGGGTGAGG ACACCAGGAA CATCACTAAC AAATCCGACT GCGCTGAGGC CAGCTACCGA 4800

TGGGTCCGGC ACAAGTACAA CTTTGACAAC CTGGGCCAGG CTCTGATGTC CCTGTTTGTG 4860 

CTGGCCTCCA AGGATGGTTG GGTTGACATC ATGTATGATG GGCTGGATGC TGTGGGTGTG 4920

GATCAGCAGC CCATCATGAA CCACAACCCC TGGATGCTGC TATACTTCAT CTCCTTCCTC 4980

CTCATCGTGG CCTTCTTTGT CCTGAACATG TTTGTGGGCG TGGTGGTGGA GAACTTCCAT 5040



AAGTGCAGAC AGCACCAGGA GGAGGAGGAG GCGAGGCGGC GTGAGGAGAA GCGACTACGG 5100 



AGGCTGGAGA AAAAGAGAAG GAGTAAGGAG AAGCAGATGG CCGAAGCCCA GTGCAAGCCC 5160 



TACTACTCTG ACTACTCGAG ATTCCGGCTC CTTGTCCACC ACCTGTGTAC CAGCCACTAC 5220 



CTGGACCTCT TCATCACTGG TGTCATCGGG CTGAACGTGG TCACTATGGC CATGGAACAT 5280 



TACCAGCAGC CCCAGATCCT GGACGAGGCT CTGAAGATCT GCAATTACAT CTTTACCGTC 5340



ATCTTTGTCT TTGAGTCAGT TTTCAAACTT GTGGCCTTTG CGTTCCGCCG TTTCTTCCAG 5400 



GACAGGTGGA ACCAGCTGGA CCTGGCTATT GTGCTTCTGT CCATCATGGG CATCACACTG 5460



GAGGAGATTG AGGTCAATCT GTCGCTGCCC ATCAACCCCA CCATCATCCG TATCATGAGG 5520 



GTGCTCCGCA TTGCTCGAGT TCTGAAGCTG TTGAAGATGG CTGTGGGCAT GCGGGCACTG 5580 



CTGCACACGG TGATGCAGGC CCTGCCCCAG GTGGGGAACC TGGGACTTCT CTTCATGTTA 5640



TTGTTTTTCA TCTTTGCAGC TCTGGGCGTG GAGCTCTTTG GAGACCTGGA GTGTGATGAG 5700 



ACACACCCTT GTGAGGGCTT GGGTCGGCAT GCCACCTTTA GGAACTTTGG TATGGCCTTT 5760

CTGACCCTCT TCCGAGTCTC CACTGGTGAC AACTGGAATG GTATTATGAA GGACCCTTCC 5820

CGGGACTGTG ACCAGGAGTC CACCTGCTAC AACACTGTCA TCTCCCCTAT CTACTTTGTG 5880

TCCTTCGTGC TGACGGCCCA GTTTGTGCTG GTCAACGTGG TCATAGCTGT GCTGATGAAG 5940

CACCTGGAAG AAAGCAACAA AGAGGCCAAG GAGGAGGCCG AGCTCGAGGC CGAGCTGGAG 6000 



CTGGAGATGA AGACGCTCAG CCCGCAGCCC CACTCCCCGC TGGGCAGCCC CTTCCTCTGG 6060 



CCCGGGGTGG AGGGTGTCAA CAGTACTGAC AGCCCTAAGC CTGGGGCTCC ACACACCACT 6120 



GCCCACATTG GAGCAGCCTC GGGCTTCTCC CTTGAGCACC CCACGATGGT ACCCCACCCC 6180

GAGGAGGTGC CAGTCCCCCT AGGACCAGAC CTGCTGACTG TGAGGAAGTC TGGTGTCAGC 6240 

CGGACGCACT CTCTGCCCAA TGACAGCTAC ATGTGCCGCA ATGGGAGCAC TGCTGAGAGA 6300 

TCCCTAGGAC ACAGGGGCTG GGGGCTCCCC AAAGCCCAGT CAGGCTCCAT CTTGTCCGTT 6360

CACTCCCAAC CAGCAGACAC CAGCTGCATC CTACAGCTTC CCAAAGATGT GCACTATCTG 6420 

CTCCAGCCTC ATGGGGCTCC CACCTGGGGC GCCATCCCTA AACTACCCCC ACCTGGCCGC 6480 

TCCCCTCTGG CTCAGAGGCC TCTCAGGCGC CAGGCAGCAA TAAGGACTGA CTCCCTGGAT 6540 

GTGCAGGGCC TGGGTAGCCG GGAAGACCTG TTGTCAGAGG TGAGTGGGCC CTCCTGCCCT 6600

CTGACCCGGT CCTCATCCTT CTGGGGCGGG TCGAGCATCC AGGTGCAGCA GCGTTCCGGC 6660

ATCCAGAGCA AAGTCTCCAA GCACATCCGC CTGCCAGCCC CTTGCCCAGG CCTGGAACCC 6720

AGCTGGGCCA AGGACCCTCC AGAGACCAGA AGCAGCTTAG AGCTGGACAC GGAGCTGAGC 6780



TGGATTTCAG GAGACCTCCT TCCCAGCAGC CAGGAAGAAC CCCTGTTCCC ACGGGACCTG 6840 



AAGAAGTGCT ACAGTGTAGA GACCCAGAGC TGCAGGCGCA GGCCTGGGTT CTGGCTAGAT 6900



GAACAGCGGA GACACTCCAT TGCTGTCAGC TGTCTGGACA GCGGCTCCCA ACCCCGCCTA 6960 



TGTCCAAGCC CCTCAAGCCT CGGGGGCCAA CCTCTTGGGG GTCCTGGGAG CCGGCCTAAG 7020 



AAAAAACTCA GCCCACCCAG TATCTCTATA GACCCCCCGG AGAGCCAGGG CTCTCGGCCC 7080

CCATGCAGTC CTGGTGTCTG CCTCAGGAGG AGGGCGCCGG CCAGTGACTC TAAGGATCCC 7140

TCGGTCTCCA GCCCCCTTGA CAGCACGGCT GCCTCACCCT CCCCAAAGAA AGACACGCTG 7200

AGTCTCTCTG GTTTGTCTTC TGACCCAACA GACATGGACC CCTGAGTCCT ACCCACTCTC 7260

CCCCATCACC TTTCTCCACC GGGTGCAGAT CCTACGTCCG CCTCCTGGGC AGCGTTTCTG 7320

AAAAGTCCCA CGTAAGCAGC AAGCAGCCAC GAGGCACCTC ACCTGCCTTC TTCAGTGGCT 7380

GGTGGGGATG ACGAGCAGAA CTTCCGGAGA GTCGATCTGA AGAGAACACA GCCCTGGAGC 7440

CCCTGCCTCC GGGAAGAAGG AAAAGGAGAA GCCCAGTGTG GCCAAGGCTC CCGACACCAG 7500 

GAGCTGTTGG GAGAAGCAAT ACGTTTGTGC AGAATCTCTA 7540 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2297 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Leu Pro His Arg Val Pro Arg Cys Val Arg Thr Pro Pro Leu Arg
1 5 10 15

Gly Ser Ala Arg Pro Ser Ser Asp Pro Pro Gly Pro Arg Leu Ala Arg
20 25 30

Gly Trp Thr Arg Arg Arg Met Glu Arg Ala Pro Arg Ser Arg Asp Ser
35 40 45

Pro Val Ala Ser Arg Ser Ser Thr Thr Cys Pro Gly Pro Gly Ala Ala
50 55 60

Gly Ala Gly Ser Thr Glu Lys Asp Pro Gly Ser Ala Asp Ser Glu Ala
65 70 75 80

Glu Gly Leu Pro Tyr Pro Ala Leu Ala Pro Val Val Phe Phe Tyr Leu 
85 90 95 

Ser Gln Asp Ser Arg Pro Arg Ser Trp Cys Leu Arg Thr Val Cys Asn
100 105 110

Pro Trp Phe Glu Arg Val Ser Met Leu Val Ile Leu Leu Asn Cys Val
115 120 125

Thr Leu Gly Met Phe Arg Pro Cys Glu Asp Ile Ala Cys Asp Ser Gln
130 135 140

Arg Cys Arg Ile Leu Gln Ala Phe Asp Asp Phe Ile Phe Ala Phe Phe
145 150 155 160

Ala Val Glu Met Val Val Lys Met Val Ala Leu Gly Ile Phe Gly Lys
165 170 175

Lys Cys Tyr Leu Gly Asp Thr Trp Asn Arg Leu Asp Phe Phe Ile Val
180 185 190

Ile Ala Gly Met Leu Glu Tyr Ser Leu Asp Leu Gln Asn Val Ser Phe
195 200 205

Ser Ala Val Arg Thr Val Arg Val Leu Arg Pro Leu Arg Ala Ile Asn
210 215 220

Arg Val Pro Ser Met Arg Ile Leu Val Thr Leu Leu Leu Asp Thr Leu
225 230 235 240

Pro Met Leu Gly Asn Val Leu Leu Leu Cys Phe Phe Val Phe Phe Ile
245 250 255



Phe Gly Ile Val Gly Val Gln Leu Trp Ala Gly Leu Leu Arg Asn Arg
260 265 270

Cys Phe Leu Pro Glu Asn Phe Ser Leu Pro Leu Ser Val Asp Leu Glu
275 280 285

Pro Tyr Tyr Gln Thr Glu Asn Glu Asp Glu Ser Pro Phe Ile Cys Ser
290 295 300

Gln Pro Arg Glu Asn Gly Met Arg Ser Cys Arg Ser Val Pro Thr Leu
305 310 315 320 

Arg Gly Glu Gly Gly Gly Gly Pro Pro Cys Ser Leu Asp Tyr Glu Thr 
325 330 335 

Tyr Asn Ser Ser Ser Asn Thr Thr Cys Val Asn Trp Asn Gln Tyr Tyr
340 345 350

Thr Asn Cys Ser Ala Gly Glu His Asn Pro Phe Lys Gly Ala Ile Asn
355 360 365

Phe Asp Asn Ile Gly Tyr Ala Trp Ile Ala Ile Phe Gln Val Ile Thr
370 375 380

Leu Glu Gly Trp Val Asp Ile Met Tyr Phe Val Met Asp Ala His Ser
385 390 395 400

Phe Tyr Asn Phe Ile Tyr Phe Ile Leu Leu Ile Ile Val Gly Ser Phe
405 410 415

Phe Met Ile Asn Leu Cys Leu Val Val Ile Ala Thr Gln Phe Ser Glu
420 425 430

Thr Lys Gln Arg Glu Ser Gln Leu Met Arg Glu Gln Arg Val Arg Phe
435 440 445

Leu Ser Asn Ala Ser Thr Leu Ala Ser Phe Ser Glu Pro Gly Ser Cys 
450 455 460 



Tyr Glu Glu Leu Leu Lys Tyr Leu Val Tyr Ile Leu Arg Lys Ala Ala
465 470 475 480

Arg Arg Leu Ala Gln Val Ser Arg Ala Ile Gly Val Arg Ala Gly Leu
485 490 495

Leu Ser Ser Pro Val Ala Arg Ser Gly Gln Glu Pro Gln Pro Ser Gly
500 505 510 

Ser Cys Thr Arg Ser His Arg Arg Leu Ser Val His His Leu Val His 
515 520 525 

His His His His His His His His Tyr His Leu Gly Asn Gly Thr Leu 
530 535 540 

Arg Val Pro Arg Ala Ser Pro Glu Ile Gln Asp Arg Asp Ala Asn Gly
545 550 555 560

Ser Arg Arg Leu Met Leu Pro Pro Pro Ser Thr Pro Thr Pro Ser Gly 
565 570 575 

Gly Pro Pro Arg Gly Ala Glu Ser Val His Ser Phe Tyr His Ala Asp 
580 585 590 

Cys His Leu Glu Pro Val Arg Cys Gln Ala Pro Pro Pro Arg Cys Pro
595 600 605

Ser Glu Ala Ser Gly Arg Thr Val Gly Ser Gly Lys Val Tyr Pro Thr
610 615 620

Val His Thr Ser Pro Pro Pro Glu Ile Leu Lys Asp Lys Ala Leu Val
625 630 635 640

Glu Val Ala Pro Ser Pro Gly Pro Pro Thr Leu Thr Ser Phe Asn Ile
645 650 655

Pro Pro Gly Pro Phe Ser Ser Met His Lys Leu Leu Glu Thr Gln Ser
660 665 670

Thr Gly Ala Cys His Ser Ser Cys Lys Ile Ser Ser Pro Cys Ser Lys
675 680 685 

Ala Asp Ser Gly Ala Cys Gly Pro Asp Ser Cys Pro Tyr Cys Ala Arg 
690 695 700 

Thr Gly Ala Gly Glu Pro Glu Ser Ala Asp His Val Met Pro Asp Ser 
705 710 715 720 

Asp Ser Glu Ala Val Tyr Glu Phe Thr Gln Asp Ala Gln His Ser Asp
725 730 735

Leu Arg Asp Pro His Ser Arg Arg Arg Gln Arg Ser Leu Gly Pro Asp
740 745 750

Ala Glu Pro Ser Ser Val Leu Ala Phe Trp Arg Leu Ile Cys Asp Thr
755 760 765

Phe Arg Lys Ile Val Asp Ser Lys Tyr Phe Gly Arg Gly Ile Met Ile
770 775 780

Ala Ile Leu Val Asn Thr Leu Ser Met Gly Ile Glu Tyr His Glu Gln
785 790 795 800

Pro Glu Glu Leu Thr Asn Ala Leu Glu Ile Ser Asn Ile Val Phe Thr
805 810 815

Ser Leu Phe Ala Leu Glu Met Leu Leu Lys Leu Leu Val Tyr Gly Pro
820 825 830

Phe Gly Tyr Ile Lys Asn Pro Tyr Asn Ile Phe Asp Gly Val Ile Val
835 840 845

Val Ile Ser Val Trp Glu Ile Val Gly Gln Gln Gly Gly Gly Leu Ser
850 855 860 

Val Leu Arg Thr Phe Arg Leu Met Arg Val Leu Lys Leu Val Arg Phe 
865 870 875 880 

Leu Pro Ala Leu Gln Arg Gln Leu Val Val Leu Met Lys Thr Met Asp
885 890 895

Asn Val Ala Thr Phe Cys Met Leu Leu Met Leu Phe Ile Phe Ile Phe
900 905 910

Ser Ile Leu Gly Met His Leu Phe Gly Cys Lys Phe Ala Ser Glu Arg
915 920 925

Asp Gly Asp Thr Leu Pro Asp Arg Lys Asn Phe Asp Ser Leu Leu Trp
930 935 940

Ala Ile Val Thr Val Phe Gln Ile Leu Thr Gln Glu Asp Trp Asn Lys
945 950 955 960

Val Leu Tyr Asn Gly Met Ala Ser Thr Ser Ser Trp Ala Ala Leu Tyr
965 970 975

Phe Ile Ala Leu Met Thr Phe Gly Asn Tyr Val Leu Phe Asn Leu Leu
980 985 990

Val Ala Ile Leu Val Glu Gly Phe Gln Ala Glu Gly Asp Ala Thr Lys
995 1000 1005 

Ser Glu Ser Glu Pro Asp Phe Phe Ser Pro Ser Val Asp Gly Asp Gly 
1010 1015 1020 



Asp Arg Lys Lys Arg Leu Ala Leu Val Ala Leu Gly Glu His Ala Glu 
1025 1030 1035 1040 



Leu Arg Lys Ser Leu Leu Pro Pro Leu Ile Ile His Thr Ala Ala Thr
1045 1050 1055 

Pro Met Ser His Pro Lys Ser Ser Ser Thr Gly Val Gly Glu Ala Leu 
1060 1065 1070 

Gly Ser Gly Ser Arg Arg Thr Ser Ser Ser Gly Ser Ala Glu Pro Gly 
1075 1080 1085 

Ala Ala His His Glu Met Lys Cys Pro Pro Ser Ala Arg Ser Ser Pro 
1090 1095 1100 

His Ser Pro Trp Ser Ala Ala Ser Ser Trp Thr Ser Arg Arg Ser Ser
1105 1110 1115 1120 

Arg Asn Ser Leu Gly Arg Ala Pro Ser Leu Lys Arg Arg Ser Pro Ser 
1125 1130 1135 

Gly Glu Arg Arg Ser Leu Leu Ser Gly Glu Gly Gln Glu Ser Gln Asp
1140 1145 1150 

Glu Glu Glu Ser Ser Glu Glu Asp Arg Ala Ser Pro Ala Gly Ser Asp 
1155 1160 1165 

His Arg His Arg Gly Ser Leu Glu Arg Glu Ala Lys Ser Ser Phe Asp 
1170 1175 1180 

Leu Pro Asp Thr Leu Gln Val Pro Gly Leu His Arg Thr Ala Ser Gly
1185 1190 1195 1200

Arg Ser Ser Ala Ser Glu His Gln Asp Cys Asn Gly Lys Ser Ala Ser
1205 1210 1215

Gly Arg Leu Ala Arg Thr Leu Arg Thr Asp Asp Pro Gln Leu Asp Gly
1220 1225 1230 



Asp Asp Asp Asn Asp Glu Gly Asn Leu Ser Lys Gly Glu Arg Ile Gln
1235 1240 1245

Ala Trp Val Arg Ser Arg Leu Pro Ala Cys Cys Arg Glu Arg Asp Ser
1250 1255 1260

Trp Ser Ala Tyr Ile Phe Pro Pro Gln Ser Arg Phe Arg Leu Leu Cys
1265 1270 1275 1280

His Arg Ile Ile Thr His Lys Met Phe Asp His Val Val Leu Val Ile
1285 1290 1295

Ile Phe Leu Asn Cys Ile Thr Ile Ala Met Glu Arg Pro Lys Ile Asp
1300 1305 1310

Pro His Ser Ala Glu Arg Ile Phe Leu Thr Leu Ser Asn Tyr Ile Phe
1315 1320 1325 

Thr Ala Val Phe Leu Ala Glu Met Thr Val Lys Val Val Ala Leu Gly 
1330 1335 1340 

Trp Cys Phe Gly Glu Gln Ala Tyr Leu Arg Ser Ser Trp Asn Val Leu
1345 1350 1355 1360

Asp Gly Leu Leu Val Leu Ile Ser Val Ile Asp Ile Leu Val Ser Met
1365 1370 1375

Val Ser Asp Ser Gly Thr Lys Ile Leu Gly Met Leu Arg Val Leu Arg
1380 1385 1390

Leu Leu Arg Thr Leu Arg Pro Leu Arg Val Ile Ser Arg Ala Gln Gly
1395 1400 1405

Leu Lys Leu Val Val Glu Thr Leu Met Ser Ser Leu Lys Pro Ile Gly
1410 1415 1420

Asn Ile Val Val Ile Cys Cys Ala Phe Phe Ile Ile Phe Gly Ile Leu
1425 1430 1435 1440

Gly Val Gln Leu Phe Lys Gly Lys Phe Phe Val Cys Gln Gly Glu Asp
1445 1450 1455 

Thr Arg Asn Ile Thr Asn Lys Ser Asp Cys Ala Glu Ala Ser Tyr Arg
1460 1465 1470

Trp Val Arg His Lys Tyr Asn Phe Asp Asn Leu Gly Gln Ala Leu Met
1475 1480 1485

Ser Leu Phe Val Leu Ala Ser Lys Asp Gly Trp Val Asp Ile Met Tyr
1490 1495 1500

Asp Gly Leu Asp Ala Val Gly Val Asp Gln Gln Pro Ile Met Asn His
1505 1510 1515 1520

Asn Pro Trp Met Leu Leu Tyr Phe Ile Ser Phe Leu Leu Ile Val Ala
1525 1530 1535 

Phe Phe Val Leu Asn Met Phe Val Gly Val Val Val Glu Asn Phe His 
1540 1545 1550 

Lys Cys Arg Gln His Gln Glu Glu Glu Glu Ala Arg Arg Arg Glu Glu
1555 1560 1565

Lys Arg Leu Arg Arg Leu Glu Lys Lys Arg Arg Asn Leu Met Leu Asp
1570 1575 1580

Asp Val Ile Ala Ser Gly Ser Ser Ala Ser Ala Ala Ser Glu Ala Gln
1585 1590 1595 1600 

Cys Lys Pro Tyr Tyr Ser Asp Tyr Ser Arg Phe Arg Leu Leu Val His 
1605 1610 1615 



His Leu Cys Thr Ser His Tyr Leu Asp Leu Phe Ile Thr Gly Val Ile
1620 1625 1630

Gly Leu Asn Val Val Thr Met Ala Met Glu His Tyr Gln Gln Pro Gln
1635 1640 1645

Ile Leu Asp Glu Ala Leu Lys Ile Cys Asn Tyr Ile Phe Thr Val Ile
1650 1655 1660

Phe Val Phe Glu Ser Val Phe Lys Leu Val Ala Phe Ala Phe Arg Arg
1665 1670 1675 1680

Phe Phe Gln Asp Arg Trp Asn Gln Leu Asp Leu Ala Ile Val Leu Leu
1685 1690 1695

Ser Ile Met Gly Ile Thr Leu Glu Glu Ile Glu Val Asn Leu Ser Leu
1700 1705 1710

Pro Ile Asn Pro Thr Ile Ile Arg Ile Met Arg Val Leu Arg Ile Ala
1715 1720 1725 

Arg Val Leu Lys Leu Leu Lys Met Ala Val Gly Met Arg Ala Leu Leu 
1730 1735 1740 

His Thr Val Met Gln Ala Leu Pro Gln Val Gly Asn Leu Gly Leu Leu
1745 1750 1755 1760

Phe Met Leu Leu Phe Phe Ile Phe Ala Ala Leu Gly Val Glu Leu Phe
1765 1770 1775 

Gly Asp Leu Glu Cys Asp Glu Thr His Pro Cys Glu Gly Leu Gly Arg 
1780 1785 1790 

His Ala Thr Phe Arg Asn Phe Gly Met Ala Phe Leu Thr Leu Phe Arg 
1795 1800 1805 

Val Ser Thr Gly Asp Asn Trp Asn Gly Ile Met Lys Asp Pro Ser Arg
1810 1815 1820

Asp Cys Asp Gln Glu Ser Thr Cys Tyr Asn Thr Val Ile Ser Pro Ile
1825 1830 1835 1840

Tyr Phe Val Ser Phe Val Leu Thr Ala Gln Phe Val Leu Val Asn Val
1845 1850 1855

Val Ile Ala Val Leu Met Lys His Leu Glu Glu Ser Asn Lys Glu Ala
1860 1865 1870 

Lys Glu Glu Ala Glu Leu Glu Ala Glu Leu Glu Leu Glu Met Lys Thr 
1875 1880 1885 

Leu Ser Pro Gln Pro His Ser Pro Leu Gly Ser Pro Phe Leu Trp Pro
1890 1895 1900

Gly Val Glu Gly Val Asn Ser Thr Asp Ser Pro Lys Pro Gly Ala Pro
1905 1910 1915 1920

His Thr Thr Ala His Ile Gly Ala Ala Ser Gly Phe Ser Leu Glu His
1925 1930 1935 

Pro Thr Met Val Pro His Pro Glu Glu Val Pro Val Pro Leu Gly Pro 
1940 1945 1950 

Asp Leu Leu Thr Val Arg Lys Ser Gly Val Ser Arg Thr His Ser Leu 
1955 1960 1965 

Pro Asn Asp Ser Tyr Met Cys Arg Asn Gly Ser Thr Ala Glu Arg Ser 
1970 1975 1980 

Leu Gly His Arg Gly Trp Gly Leu Pro Lys Ala Gln Ser Gly Ser Ile
1985 1990 1995 2000

Leu Ser Val His Ser Gln Pro Ala Asp Thr Ser Cys Ile Leu Gln Leu
2005 2010 2015

Pro Lys Asp Val His Tyr Leu Leu Gln Pro His Gly Ala Pro Thr Trp
2020 2025 2030

Gly Ala Ile Pro Lys Leu Pro Pro Pro Gly Arg Ser Pro Leu Ala Gln
2035 2040 2045

Arg Pro Leu Arg Arg Gln Ala Ala Ile Arg Thr Asp Ser Leu Asp Val
2050 2055 2060

Gln Gly Leu Gly Ser Arg Glu Asp Leu Leu Ser Glu Val Ser Gly Pro
2065 2070 2075 2080

Ser Cys Pro Leu Thr Arg Ser Ser Ser Phe Trp Gly Gly Ser Ser Ile
2085 2090 2095

Gln Val Gln Gln Arg Ser Gly Ile Gln Ser Lys Val Ser Lys His Ile
2100 2105 2110 

Arg Leu Pro Ala Pro Cys Pro Gly Leu Glu Pro Ser Trp Ala Lys Asp 
2115 2120 2125 

Pro Pro Glu Thr Arg Ser Ser Leu Glu Leu Asp Thr Glu Leu Ser Trp
2130 2135 2140

Ile Ser Gly Asp Leu Leu Pro Ser Ser Gln Glu Glu Pro Leu Phe Pro
2145 2150 2155 2160

Arg Asp Leu Lys Lys Cys Tyr Ser Val Glu Thr Gln Ser Cys Arg Arg
2165 2170 2175

Arg Pro Gly Phe Trp Leu Asp Glu Gln Arg Arg His Ser Ile Ala Val
2180 2185 2190

Ser Cys Leu Asp Ser Gly Ser Gln Pro Arg Leu Cys Pro Ser Pro Ser
2195 2200 2205

Ser Leu Gly Gly Gln Pro Leu Gly Gly Pro Gly Ser Arg Pro Lys Lys
2210 2215 2220

Lys Leu Ser Pro Pro Ser Ile Ser Ile Asp Pro Pro Glu Ser Gln Gly
2225 2230 2235 2240 



Ser Arg Pro Pro Cys Ser Pro Gly Val Cys Leu Arg Arg Arg Ala Pro 
2245 2250 2255 

Ala Ser Asp Ser Lys Asp Pro Ser Val Ser Ser Pro Leu Asp Ser Thr 
2260 2265 2270 

Ala Ala Ser Pro Ser Pro Lys Lys Asp Thr Leu Ser Leu Ser Gly Leu 
2275 2280 2285 

Ser Ser Asp Pro Thr Asp Met Asp Pro 
2290 2295 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2304 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Leu Pro His Arg Val Pro Arg Cys Val Arg Thr Pro Pro Leu Arg 
1 5 10 15 

Gly Ser Ala Arg Pro Ser Ser Asp Pro Pro Gly Pro Arg Leu Ala Arg 
20 25 30 

Gly Trp Thr Arg Arg Arg Met Glu Arg Ala Pro Arg Ser Arg Asp Ser
35 40 45

Pro Val Ala Ser Arg Ser Ser Thr Thr Cys Pro Gly Pro Gly Ala Ala 
50 55 60 

Gly Ala Gly Ser Thr Glu Lys Asp Pro Gly Ser Ala Asp Ser Glu Ala 
65 70 75 80 

Glu Gly Leu Pro Tyr Pro Ala Leu Ala Pro Val Val Phe Phe Tyr Leu 
85 90 95 

Ser Gln Asp Ser Arg Pro Arg Ser Trp Cys Leu Arg Thr Val Cys Asn
100 105 110

Pro Trp Phe Glu Arg Val Ser Met Leu Val Ile Leu Leu Asn Cys Val
115 120 125

Thr Leu Gly Met Phe Arg Pro Cys Glu Asp Ile Ala Cys Asp Ser Gln
130 135 140

Arg Cys Arg Ile Leu Gln Ala Phe Asp Asp Phe Ile Phe Ala Phe Phe
145 150 155 160

Ala Val Glu Met Val Val Lys Met Val Ala Leu Gly Ile Phe Gly Lys
165 170 175

Lys Cys Tyr Leu Gly Asp Thr Trp Asn Arg Leu Asp Phe Phe Ile Val
180 185 190

Ile Ala Gly Met Leu Glu Tyr Ser Leu Asp Leu Gln Asn Val Ser Phe
195 200 205

Ser Ala Val Arg Thr Val Arg Val Leu Arg Pro Leu Arg Ala Ile Asn
210 215 220

Arg Val Pro Ser Met Arg Ile Leu Val Thr Leu Leu Leu Asp Thr Leu
225 230 235 240

Pro Met Leu Gly Asn Val Leu Leu Leu Cys Phe Phe Val Phe Phe Ile
245 250 255

Phe Gly Ile Val Gly Val Gln Leu Trp Ala Gly Leu Leu Arg Asn Arg
260 265 270

Cys Phe Leu Pro Glu Asn Phe Ser Leu Pro Leu Ser Val Asp Leu Glu 
275 280 285 

Pro Tyr Tyr Gln Thr Glu Asn Glu Asp Glu Ser Pro Phe Ile Cys Ser
290 295 300

Gln Pro Arg Glu Asn Gly Met Arg Ser Cys Arg Ser Val Pro Thr Leu
305 310 315 320

Arg Gly Glu Gly Gly Gly Gly Pro Pro Cys Ser Leu Asp Tyr Glu Thr
325 330 335

Tyr Asn Ser Ser Ser Asn Thr Thr Cys Val Asn Trp Asn Gln Tyr Tyr
340 345 350

Thr Asn Cys Ser Ala Gly Glu His Asn Pro Phe Lys Gly Ala Ile Asn
355 360 365

Phe Asp Asn Ile Gly Tyr Ala Trp Ile Ala Ile Phe Gln Val Ile Thr
370 375 380

Leu Glu Gly Trp Val Asp Ile Met Tyr Phe Val Met Asp Ala His Ser
385 390 395 400

Phe Tyr Asn Phe Ile Tyr Phe Ile Leu Leu Ile Ile Val Gly Ser Phe
405 410 415

Phe Met Ile Asn Leu Cys Leu Val Val Ile Ala Thr Gln Phe Ser Glu
420 425 430

Thr Lys Gln Arg Glu Ser Gln Leu Met Arg Glu Gln Arg Val Arg Phe
435 440 445

Leu Ser Asn Ala Ser Thr Leu Ala Ser Phe Ser Glu Pro Gly Ser Cys
450 455 460

Tyr Glu Glu Leu Leu Lys Tyr Leu Val Tyr Ile Leu Arg Lys Ala Ala
465 470 475 480

Arg Arg Leu Ala Gln Val Ser Arg Ala Ile Gly Val Arg Ala Gly Leu
485 490 495

Leu Ser Ser Pro Val Ala Arg Ser Gly Gln Glu Pro Gln Pro Ser Gly
500 505 510

Ser Cys Thr Arg Ser His Arg Arg Leu Ser Val His His Leu Val His
515 520 525

His His His His His His His His Tyr His Leu Gly Asn Gly Thr Leu
530 535 540

Arg Val Pro Arg Ala Ser Pro Glu Ile Gln Asp Arg Asp Ala Asn Gly
545 550 555 560

Ser Arg Arg Leu Met Leu Pro Pro Pro Ser Thr Pro Thr Pro Ser Gly
565 570 575

Gly Pro Pro Arg Gly Ala Glu Ser Val His Ser Phe Tyr His Ala Asp
580 585 590

Cys His Leu Glu Pro Val Arg Cys Gln Ala Pro Pro Pro Arg Cys Pro
595 600 605

Ser Glu Ala Ser Gly Arg Thr Val Gly Ser Gly Lys Val Tyr Pro Thr
610 615 620



Val His Thr Ser Pro Pro Pro Glu Ile Leu Lys Asp Lys Ala Leu Val
625 630 635 640

Glu Val Ala Pro Ser Pro Gly Pro Pro Thr Leu Thr Ser Phe Asn Ile
645 650 655

Pro Pro Gly Pro Phe Ser Ser Met His Lys Leu Leu Glu Thr Gln Ser
660 665 670

Thr Gly Ala Cys His Ser Ser Cys Lys Ile Ser Ser Pro Cys Ser Lys
675 680 685

Ala Asp Ser Gly Ala Cys Gly Pro Asp Ser Cys Pro Tyr Cys Ala Arg
690 695 700

Thr Gly Ala Gly Glu Pro Glu Ser Ala Asp His Val Met Pro Asp Ser
705 710 715 720

Asp Ser Glu Ala Val Tyr Glu Phe Thr Gln Asp Ala Gln His Ser Asp
725 730 735

Leu Arg Asp Pro His Ser Arg Arg Arg Gln Arg Ser Leu Gly Pro Asp
740 745 750

Ala Glu Pro Ser Ser Val Leu Ala Phe Trp Arg Leu Ile Cys Asp Thr
755 760 765

Phe Arg Lys Ile Val Asp Ser Lys Tyr Phe Gly Arg Gly Ile Met Ile
770 775 780

Ala Ile Leu Val Asn Thr Leu Ser Met Gly Ile Glu Tyr His Glu Gln
785 790 795 800

Pro Glu Glu Leu Thr Asn Ala Leu Glu Ile Ser Asn Ile Val Phe Thr
805 810 815

Ser Leu Phe Ala Leu Glu Met Leu Leu Lys Leu Leu Val Tyr Gly Pro
820 825 830

Phe Gly Tyr Ile Lys Asn Pro Tyr Asn Ile Phe Asp Gly Val Ile Val
835 840 845

Val Ile Ser Val Trp Glu Ile Val Gly Gln Gln Gly Gly Gly Leu Ser
850 855 860

Val Leu Arg Thr Phe Arg Leu Met Arg Val Leu Lys Leu Val Arg Phe
865 870 875 880

Leu Pro Ala Leu Gln Arg Gln Leu Val Val Leu Met Lys Thr Met Asp
885 890 895

Asn Val Ala Thr Phe Cys Met Leu Leu Met Leu Phe Ile Phe Ile Phe
900 905 910

Ser Ile Leu Gly Met His Leu Phe Gly Cys Lys Phe Ala Ser Glu Arg
915 920 925

Asp Gly Asp Thr Leu Pro Asp Arg Lys Asn Phe Asp Ser Leu Leu Trp
930 935 940

Ala Ile Val Thr Val Phe Gln Ile Leu Thr Gln Glu Asp Trp Asn Lys
945 950 955 960

Val Leu Tyr Asn Gly Met Ala Ser Thr Ser Ser Trp Ala Ala Leu Tyr
965 970 975

Phe Ile Ala Leu Met Thr Phe Gly Asn Tyr Val Leu Phe Asn Leu Leu
980 985 990

Val Ala Ile Leu Val Glu Gly Phe Gln Ala Glu Gly Asp Ala Thr Lys
995 1000 1005



Ser Glu Ser Glu Pro Asp Phe Phe Ser Pro Ser Val Asp Gly Asp Gly 
1010 1015 1020

Asp Arg Lys Lys Arg Leu Ala Leu Val Ala Leu Gly Glu His Ala Glu
1025 1030 1035 1040

Leu Arg Lys Ser Leu Leu Pro Pro Leu Ile Ile His Thr Ala Ala Thr
1045 1050 1055

Pro Met Ser His Pro Lys Ser Ser Ser Thr Gly Val Gly Glu Ala Leu
1060 1065 1070

Gly Ser Gly Ser Arg Arg Thr Ser Ser Ser Gly Ser Ala Glu Pro Gly 
1075 1080 1085 

Ala Ala His His Glu Met Lys Cys Pro Pro Ser Ala Arg Ser Ser Pro 
1090 1095 1100 

His Ser Pro Trp Ser Ala Ala Ser Ser Trp Thr Ser Arg Arg Ser Ser 
1105 1110 1115 1120 

Arg Asn Ser Leu Gly Arg Ala Pro Ser Leu Lys Arg Arg Ser Pro Ser 
1125 1130 1135 

Gly Glu Arg Arg Ser Leu Leu Ser Gly Glu Gly Gln Glu Ser Gln Asp
1140 1145 1150 

Glu Glu Glu Ser Ser Glu Glu Asp Arg Ala Ser Pro Ala Gly Ser Asp 
1155 1160 1165 

His Arg His Arg Gly Ser Leu Glu Arg Glu Ala Lys Ser Ser Phe Asp 
1170 1175 1180 

Leu Pro Asp Thr Leu Gln Val Pro Gly Leu His Arg Thr Ala Ser Gly
1185 1190 1195 1200

Arg Ser Ser Ala Ser Glu His Gln Asp Cys Asn Gly Lys Ser Ala Ser
1205 1210 1215

Gly Arg Leu Ala Arg Thr Leu Arg Thr Asp Asp Pro Gln Leu Asp Gly
1220 1225 1230

Asp Asp Asp Asn Asp Glu Gly Asn Leu Ser Lys Gly Glu Arg Ile Gln
1235 1240 1245

Ala Trp Val Arg Ser Arg Leu Pro Ala Cys Cys Arg Glu Arg Asp Ser
1250 1255 1260

Trp Ser Ala Tyr Ile Phe Pro Pro Gln Ser Arg Phe Arg Leu Leu Cys
1265 1270 1275 1280

His Arg Ile Ile Thr His Lys Met Phe Asp His Val Val Leu Val Ile
1285 1290 1295

Ile Phe Leu Asn Cys Ile Thr Ile Ala Met Glu Arg Pro Lys Ile Asp
1300 1305 1310

Pro His Ser Ala Glu Arg Ile Phe Leu Thr Leu Ser Asn Tyr Ile Phe
1315 1320 1325

Thr Ala Val Phe Leu Ala Glu Met Thr Val Lys Val Val Ala Leu Gly
1330 1335 1340

Trp Cys Phe Gly Glu Gln Ala Tyr Leu Arg Ser Ser Trp Asn Val Leu
1345 1350 1355 1360

Asp Gly Leu Leu Val Leu Ile Ser Val Ile Asp Ile Leu Val Ser Met
1365 1370 1375

Val Ser Asp Ser Gly Thr Lys Ile Leu Gly Met Leu Arg Val Leu Arg
1380 1385 1390

Leu Leu Arg Thr Leu Arg Pro Leu Arg Val Ile Ser Arg Ala Gln Gly
1395 1400 1405

Leu Lys Leu Val Val Glu Thr Leu Met Ser Ser Leu Lys Pro Ile Gly
1410 1415 1420

Asn Ile Val Val Ile Cys Cys Ala Phe Phe Ile Ile Phe Gly Ile Leu
1425 1430 1435 1440

Gly Val Gln Leu Phe Lys Gly Lys Phe Phe Val Cys Gln Gly Glu Asp
1445 1450 1455

Thr Arg Asn Ile Thr Asn Lys Ser Asp Cys Ala Glu Ala Ser Tyr Arg
1460 1465 1470

Trp Val Arg His Lys Tyr Asn Phe Asp Asn Leu Gly Gln Ala Leu Met
1475 1480 1485

Ser Leu Phe Val Leu Ala Ser Lys Asp Gly Trp Val Asp Ile Met Tyr
1490 1495 1500

Asp Gly Leu Asp Ala Val Gly Val Asp Gln Gln Pro Ile Met Asn His
1505 1510 1515 1520

Asn Pro Trp Met Leu Leu Tyr Phe Ile Ser Phe Leu Leu Ile Val Ala
1525 1530 1535

Phe Phe Val Leu Asn Met Phe Val Gly Val Val Val Glu Asn Phe His
1540 1545 1550



Lys Cys Arg Gln His Gln Glu Glu Glu Glu Ala Arg Arg Arg Glu Glu
1555 1560 1565

Lys Arg Leu Arg Arg Leu Glu Lys Lys Arg Arg Ser Lys Glu Lys Gln
1570 1575 1580

Met Ala Asp Leu Met Leu Asp Asp Val Ile Ala Ser Gly Ser Ser Ala
1585 1590 1595 1600

Ser Ala Ala Ser Glu Ala Gln Cys Lys Pro Tyr Tyr Ser Asp Tyr Ser
1605 1610 1615

Arg Phe Arg Leu Leu Val His His Leu Cys Thr Ser His Tyr Leu Asp
1620 1625 1630

Leu Phe Ile Thr Gly Val Ile Gly Leu Asn Val Val Thr Met Ala Met
1635 1640 1645

Glu His Tyr Gln Gln Pro Gln Ile Leu Asp Glu Ala Leu Lys Ile Cys
1650 1655 1660

Asn Tyr Ile Phe Thr Val Ile Phe Val Phe Glu Ser Val Phe Lys Leu
1665 1670 1675 1680

Val Ala Phe Ala Phe Arg Arg Phe Phe Gln Asp Arg Trp Asn Gln Leu
1685 1690 1695

Asp Leu Ala Ile Val Leu Leu Ser Ile Met Gly Ile Thr Leu Glu Glu
1700 1705 1710

Ile Glu Val Asn Leu Ser Leu Pro Ile Asn Pro Thr Ile Ile Arg Ile
1715 1720 1725

Met Arg Val Leu Arg Ile Ala Arg Val Leu Lys Leu Leu Lys Met Ala
1730 1735 1740

Val Gly Met Arg Ala Leu Leu His Thr Val Met Gln Ala Leu Pro Gln
1745 1750 1755 1760

Val Gly Asn Leu Gly Leu Leu Phe Met Leu Leu Phe Phe Ile Phe Ala
1765 1770 1775

Ala Leu Gly Val Glu Leu Phe Gly Asp Leu Glu Cys Asp Glu Thr His
1780 1785 1790

Pro Cys Glu Gly Leu Gly Arg His Ala Thr Phe Arg Asn Phe Gly Met
1795 1800 1805

Ala Phe Leu Thr Leu Phe Arg Val Ser Thr Gly Asp Asn Trp Asn Gly
1810 1815 1820

Ile Met Lys Asp Pro Ser Arg Asp Cys Asp Gln Glu Ser Thr Cys Tyr
1825 1830 1835 1840

Asn Thr Val Ile Ser Pro Ile Tyr Phe Val Ser Phe Val Leu Thr Ala
1845 1850 1855

Gln Phe Val Leu Val Asn Val Val Ile Ala Val Leu Met Lys His Leu
1860 1865 1870

Glu Glu Ser Asn Lys Glu Ala Lys Glu Glu Ala Glu Leu Glu Ala Glu 
1875 1880 1885 

Leu Glu Leu Glu Met Lys Thr Leu Ser Pro Gln Pro His Ser Pro Leu
1890 1895 1900 

Gly Ser Pro Phe Leu Trp Pro Gly Val Glu Gly Val Asn Ser Thr Asp 
1905 1910 1915 1920 

Ser Pro Lys Pro Gly Ala Pro His Thr Thr Ala His Ile Gly Ala Ala
1925 1930 1935 

Ser Gly Phe Ser Leu Glu His Pro Thr Met Val Pro His Pro Glu Glu 
1940 1945 1950 

Val Pro Val Pro Leu Gly Pro Asp Leu Leu Thr Val Arg Lys Ser Gly 
1955 1960 1965 

Val Ser Arg Thr His Ser Leu Pro Asn Asp Ser Tyr Met Cys Arg Asn 
1970 1975 1980 

Gly Ser Thr Ala Glu Arg Ser Leu Gly His Arg Gly Trp Gly Leu Pro 
1985 1990 1995 2000

Lys Ala Gln Ser Gly Ser Ile Leu Ser Val His Ser Gln Pro Ala Asp
2005 2010 2015

Thr Ser Cys Ile Leu Gln Leu Pro Lys Asp Val His Tyr Leu Leu Gln
2020 2025 2030

Pro His Gly Ala Pro Thr Trp Gly Ala Ile Pro Lys Leu Pro Pro Pro
2035 2040 2045

Gly Arg Ser Pro Leu Ala Gln Arg Pro Leu Arg Arg Gln Ala Ala Ile
2050 2055 2060

Arg Thr Asp Ser Leu Asp Val Gln Gly Leu Gly Ser Arg Glu Asp Leu
2065 2070 2075 2080

Leu Ser Glu Val Ser Gly Pro Ser Cys Pro Leu Thr Arg Ser Ser Ser
2085 2090 2095

Phe Trp Gly Gly Ser Ser Ile Gln Val Gln Gln Arg Ser Gly Ile Gln
2100 2105 2110

Ser Lys Val Ser Lys His Ile Arg Leu Pro Ala Pro Cys Pro Gly Leu
2115 2120 2125

Glu Pro Ser Trp Ala Lys Asp Pro Pro Glu Thr Arg Ser Ser Leu Glu
2130 2135 2140

Leu Asp Thr Glu Leu Ser Trp Ile Ser Gly Asp Leu Leu Pro Ser Ser
2145 2150 2155 2160

Gln Glu Glu Pro Leu Phe Pro Arg Asp Leu Lys Lys Cys Tyr Ser Val
2165 2170 2175

Glu Thr Gln Ser Cys Arg Arg Arg Pro Gly Phe Trp Leu Asp Glu Gln
2180 2185 2190

Arg Arg His Ser Ile Ala Val Ser Cys Leu Asp Ser Gly Ser Gln Pro
2195 2200 2205

Arg Leu Cys Pro Ser Pro Ser Ser Leu Gly Gly Gln Pro Leu Gly Gly
2210 2215 2220

Pro Gly Ser Arg Pro Lys Lys Lys Leu Ser Pro Pro Ser Ile Ser Ile
2225 2230 2235 2240

Asp Pro Pro Glu Ser Gln Gly Ser Arg Pro Pro Cys Ser Pro Gly Val
2245 2250 2255

Cys Leu Arg Arg Arg Ala Pro Ala Ser Asp Ser Lys Asp Pro Ser Val 
2260 2265 2270 

Ser Ser Pro Leu Asp Ser Thr Ala Ala Ser Pro Ser Pro Lys Lys Asp 
2275 2280 2285 

Thr Leu Ser Leu Ser Gly Leu Ser Ser Asp Pro Thr Asp Met Asp Pro 
2290 2295 2300 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Arg Ile Met Arg Val Leu Arg Ile Ala Arg Val Leu Lys Leu Leu Lys
1 5 10 15

Met Ala Val Gly Met Arg Ala
20

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Arg Leu Phe Arg Val Met Arg Leu Ile Lys Leu Leu Ser Arg Ala Glu
1 5 10 15

Gly Val

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Arg Leu Phe Arg Val Met Arg Leu Val Lys Leu Leu Ser Arg Gly Glu
1 5 10 15

Gly Ile



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Arg Leu Phe Arg Ala Ala Arg Leu Ile Lys Leu Leu Arg Gln Gly Tyr
1 5 10 15

Thr Ile

What is claimed is:

1. An isolated or substantially purified nucleic acid encoding a protein comprising
at least one domain of a T-type calcium channel α subunit.

2. The nucleic acid of claim 1, which encodes an entire T-type calcium channel
α subunit.

3. The nucleic acid of claim 2, wherein said protein comprises SEQ ID NO:1,
SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:6 or a derivative of any of said sequences.

4. The nucleic acid of claim 1, wherein said protein comprises SEQ ID NO:7.

5. The nucleic acid of claim 2, wherein said protein gates from about -45 mV to
about -30 mV in 2 mM Ba²⁺.

6. The nucleic acid of claim 2, wherein said protein exhibits a tail current of
from about 2 ms to about 10 ms following repolarization to a membrane potential from
about -80 mV to about -60 mV in a solution with a barium concentration of from about
10 mM to about 40 mM.

7. The nucleic acid of claim 2, wherein said protein exhibits a single channel
conductance of from 7 pS to about 10 pS in a solution with a barium ion concentration of
about 100 mM.

8. An isolated or substantially purified nucleic acid hybridizing to SEQ ID NO:2
or SEQ ID NO:4 under high stringency.

9. An isolated or substantially purified DNA hybridizing to the nucleic acid of
claim 8.

10. The DNA of claim 9 comprising a sequence encoding a T-type calcium
channel.

11. A vector comprising the nucleic acid of claim 1.

12. A cell into which the vector of claim 11 has been introduced.

13. The cell of claim 12, wherein said nucleic acid is expressed to produce a
protein.

14. A method of identifying a drug which affects T-type calcium channels, said
method comprising expressing a T-type calcium channel in a cell, exposing said cell to a
putative drug, and measuring the calcium flux through the membrane of said cell in
response to a change in membrane potential.

15. The method of claim 14, wherein said calcium flux is assayed by using a
calcium-sensitive labile dye within said cell.

16. The method of claim 14, wherein said calcium flux is assayed by measuring
the electrophysiological properties of said cell.

17. The method of claim 14, wherein said calcium channel comprises SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:6, or a derivative of any of said
sequences.

18. The method of claim 14, wherein said calcium channel comprises SEQ ID
NO:7.

19. An isolated or substantially purified antibody molecule recognizing an epitope
on a T-type calcium channel protein.

ABSTRACT

The present invention provides an isolated or substantially purified nucleic acid
encoding a protein comprising at least one domain of a T-type calcium channel α subunit
and cells expressing such nucleic acids. The present invention also provides isolated or 
substantially purified T-type calcium channels and an isolated or substantially purified 
antibody molecule recognizing an epitope on a T-type calcium channel protein. 
Additionally, the present invention provides an assay for identifying potential drugs 
affecting T-type calcium channels by exposing cells expressing a nucleic acid encoding a 
T-type calcium channel to a putative drug and then measuring the calcium flux in 
response to a change in membrane potential. 
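
The assay summarized above compares the calcium flux evoked by a change in membrane potential before and after a cell is exposed to a putative drug. As a minimal sketch of how such a comparison might be scored, assuming the flux is summarized as one peak current per cell, the Python below computes a percent inhibition and flags candidate compounds. The function names, the choice of peak current as the readout, and the 50 percent cutoff are illustrative assumptions and are not taken from the specification.

```python
def percent_inhibition(control_peak, drug_peak):
    """Percent reduction in peak current magnitude after drug exposure.

    Both arguments are peak currents in the same unit (for example pA).
    Magnitudes are used, so inward (negative) currents are handled.
    """
    if control_peak == 0:
        raise ValueError("control peak current must be non-zero")
    return 100.0 * (1.0 - abs(drug_peak) / abs(control_peak))


def is_candidate(control_peak, drug_peak, threshold_percent=50.0):
    """Flag a compound whose inhibition meets a hypothetical cutoff."""
    return percent_inhibition(control_peak, drug_peak) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical readings: -200 pA before drug, -50 pA after drug.
    print(percent_inhibition(-200.0, -50.0))
    print(is_candidate(-200.0, -50.0))
```

A negative result from `percent_inhibition` would indicate that the current grew after exposure, which a screen for channel activators could use in the same way.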



ATGCTCCCCCACCGGGTCCCCCGTTGCGTGAGGACACCTCCTCTGAGGGGCTCCGCTCGCCCCTCT 
1 MLPHRVPRCVRTPPLRGSARPS

6 6 TCGGACCCCCCGGGGCCCOGGCTGGCCAGAGGATGGACGAGGAGGAGGATGGAGCGGGCGCCGAGGAGTCGGGAC 
23 SDPPGPRLARGWTRRRMERAPRSRD

141 AGCCCCGTAGCTTCACGOVGCTCAACGACCTGTCCGGGGCCGGGGGCGGCAGGGGCCGGGTCGACGGAAAAGGAC 
48 SPVASRSSTTCPGPGAA GAGSTEKD 

216 CCGGGCAGCGCGGACTCCGAGGCGGAGGGGCTGCCGTACCCGGCGCTAGCCCCGGTGGTTTTCTTCTACTTGAGC 
73 PGSADSEAEGLPYPALAPVVFFYLS 

***************************** 

291 CAGGACAGCCGCCCGCGGAGCTGGTGTCTCCGCACGGTCTGTAACCCGTGGTTCGAGCGAGTCAGTATGCTGGTG 
98 QDS RPRS WC LRTVCNPWF E RVS MLV 

******IS1*** ********* ***** *********

366 ATTCTTCTCAACTGTGTGACTCTGGGTATGTTCA^ 

123 ILLNCVTLGMFRPCEDIACDSQRCR

*************************************IS2**************************

441 ATCCTGCAGGCCTTCGATGACTTCATCTTTGCCTTCTTTGCTGTGGAAATGGTGGTGAAGATGGTGGCCTTGGGC 
148 ILQAFDDFI FAFFAVEMVVKMVALG

*** ******** **************IS3 *******************

516 ATCTTTGGGAAGAAATGTTACCTGGGAGACACTTGGAACCGGCTTGACTTTTTCATTGTCATTGCAGGGATGCTG 
173 IFGKKCYLGDTWNRLDFFIVIAGML 

*************** ********************* IS4 ******************** 

5 91 GAGTATTQXITGGACCTGCAGAACGTCAGCTTCTCCGC AGTCAGGACAGTCCG TGTG CTGCGACCGCTC AGGGCC 
198 EYS LDL QNV SFSAVRTVR V L R PLRA 

*************************** ****************** 
666 ATTAACCGGGTGCCCAjCK^TGCGCATTCTCGT(^CATTACTGCTGGACACCTTGCCT 

223 INRVP SMRILVTLLLDTLPMLGNVL 

***************** ************ IS5 ************************* ********

741 CTGCTCTGTTTCTTCGTCTTTTTCATCTTTGG 

248 LLCFFVFFI FGIVGVQLWAGLLRNR 

816 TGCTTCCTCCCCGAGAACTTCAGCCTCCCCCTGAGCGTGGACCTGGAGCCTTATTACCAGACAGAGAATGAGGAC 
273 CF LPENFSLPLSVDLEPYYQTENED 

891 GAGAGCCCCTTCATCTGCTCTCAGCCTCGGGAG AATGGC ATG AGATC CTG CAGGAGTGTGCCCAC ACTGCG TGGG 
298 ESPFICSQPRENGMRSCRSVPTLRG



Figure 1A 



966 GAAGGCGGTGGTGGCCCACCCTGCAQTCTGGACTATGAGACCTATAACAGTTCCAGCAACACCACCTGrGTCAAC 
323 EGGGGPPCSLDYETYNSSSNTTCVN 



************ 

1041 T<^AACCAGTACTATACCAACTGCTCTGCGGGCGAGCACAACC^ 
348 WNQYYTNCSAGEHNPFKGAINFDNI 

************* I P Loop****************************************

1116 GGCTATGCCTGGATCGCCATCTTCCAGGTCATCACACTGGAGGGCTGGGTCGACATCATGTACTTCGTAATGGAC 
373 GYAWIAIFQVITLEGWVDIMYFVMD 

****************************** IS6 *********** ********* ***********
1191 GCTCACTCCTTCTACAACTTCATCTACT^ 
398 AHSFYNFIYFILLI IVGSFFMINLC 

1266 CTGGTGGTGATTGCCACGCAGTTCTCCGAGACCAAACAGCGGGAGAGTCAGCTGATGCGGGAGCAGCGTGTACGA 
423 LVVIATQFSETKQRESQLMREQRVR 

1341 TTCCTGTCCAATGCTAGC&CCCTGGCAAGCTTCT 
448 FLSNASTLASFSEPGSCYEELLKYL

1416 GTGTACATCCTCCGAAAAGCAGCCCGAAGGCTGGCCCAGGTCTCTAGGGCTATAGGCGTGCGGGCTGGGCTGCTC
473 VYILRKAARRL AQVS RAIGVRAGLL 

1491 AGCAGCCCAGTGGCCCGTAGTGGGCAGGAGCCCCAGCCCAGTGGCAGCTGCACTCGCTCAC ACCGTCGTCTGTCT 
498 S S PVARSG QE PQP SGS CTRSHRRLS 

1566 GTCCACCACCTGGTCCACCACCATCACCACCACCATCAC 
523 VHHLVHHHHHHHHHYHLGNGTLRVP

1641 C^JGGCCAGCCCAGAGATCCAGGACAGGGATGCCAATGGGTCTC 
548 RAS PE I QDR DANG S RRLML P P P S T P 

1716 ACTCCCTCTGGGGGCCCTCCGAGGGGTGCGGAGTCTGTACACAGCTTCTACCATGCTGACTGCCACTTGGAGCCA 
573 TPSGGPPRGAESVH SFYHADCHLEP

1791 GTCCGTTGCCAGGCACCCCCTCCCAGATGCC(^^ 
598 VRCQAPPPRCPSEASGRTVGSGKVY 

1866 CC CACTGTGCATACCAGCCCTCC^CCAGAGAT^ 
623 PTVHTSPPPEILKDKALVEVAPSPG

1941 CCCCCCACCCTCACCAGCTTCAACATCCCACCTGGGC^ 
648 PPTLTSFNIPPGPFSSMHKLLETQS 

2016 ACGGGAGCCTGCCATAGCTCCTGCAAAATCTCC^GCCCTTGCTCCAAGGCAGACAGTGGAGCCTGCGGGCCGGAC 
673 TGACHSSCKISSPCSKADSGACGPD 

2091 AGTTGTCCCTACTGTGCCCGGACAGGAGCAGGAGAGCCAGAGT^ 
698 SCPYCARTGAGEPESADHVMPDSDS 

2166 GAGGCTGTGTATGAGTTCACACAGGACGCTCAGCACAGTGACCTCCGGGATCCCCACAGCCGGCGGCGACAGCGG 
723 EAVYEFTQDAQHSDLRDPHSRRRQR 

2241 AGCCTGGGCCC AGATGCAGAGCCTAGTTCTG TGCTGGCT TTCTGG AGG CTG ATCTGTGAC AC ATTCCGGAAGATC 
748 SLGPDAEPS SVLAFWR LI CDTFRK I 



Figure 1B



****************************** *****IIS1*********** ***************
2316 GTAGATAGCAAATACTTTGGCCGGGGAATC&TGATCG 
773 VDSKY FGRG IMI AILVNTLSMG I EY 



*** ************************************ 

2391 CACGAGCAGCCCGAGGAGCTCACCAACGCCCTGGAAATCAGCAACATCGTCTTCACCAGCCTCTTCGCCTTGGAG 
798 HEQPEELTNALEISNIVFTSLFALE



****IIS2 ************** ******* ************* ******IIS3**

2466 ATGCTGCTGAAA^TGCTTGTCTACGGTCCCTTTGGCTACATTAAGAATCCCTACAACATCTTTG 
823 MLLKLLVYGPFGYI KNPYNIFDGV I 

********************************* ********************* 
2541 GTGGTCATCAGTGTGTGGGAGATTGTGGGCCAGCAGGGAGGTGGCCTGTCGGTGCTGCGGACCTTCCGCCTGATG 
848 VVISVWEIVGQQG GGLSVLRTFRLM 

******************* IIS4 ******************* ******* ***

2 6 16 CGGG TG CTGAAG CT GGTGCGCTT C C TG CCGG C CC TGC AG C G CCAG CTCGTG GTGC TC ATG AAGACCATGG ACAAC 
873 RVLKLVRFLPALQRQLVVLMKTMDN

******* *************************IIS5 ***************** ********************

^ 2691 GTGGCCACCTTCTGCATGCTCCTCATGCTGl^ 

898 VATFCMLLMLFIFIFSILGMHLFGC



U} *** *************************** 

2766 AAGTTCGCATCTGAACGGGATGGGGACACGTTGCCAGAC^ 
923 KFASERDGDTLPDRKNFDSLLWAIV



**********II Pore Loop********************* ******

2841 ACTGTCTTTCAGATTCTGACTCAGGAAGACTGGAATAAAGTCCTCTACAACGGCATGGCCTCCACATCGT^ 



948 TVFQILTQEDWNKVLYNGMASTSSW 



********************* ************* IIS6******************** ***************

2916 GCTGCTCTTTACTTCATCGCCCTCATGACTTTT 
973 AALYFIALMTFGNYVLFNL LVAILV 

********* 

2991 GAAGGATTCCAGGCAGAGGGAGATGCCACCAAGTCTGAGT 
998 EGFQAEGDATKS ES EPDFFSPSVDG 

3066 GATGGGGACAGAAAGAAGCGCTTGGCCCTGGTGGCTTTG<5GAGAACACGCGGAACTACGAAAGAGCCTTTTGCCA 
1023 DGDRKKRLALVALG EH AELRKSLLP 

3141 CCCCTCATCATCCATACGGCTGCGACACCAATGTCACACCCCAAGAGCTCCAGCACAGGTGTGGGOGAAGCAC'I'G 
1048 PLIIHTAATPMSHPKSSSTGVGEAL 



3216 GGCTCTGGCTCTCGACGTACCAGTAGCAGTGGCTCCGCTGA^ 

1073 GSGSRRTSSSGSAE PGAAHHEMKC P 

32 91 CCAAGTGCCCGCAGCTCCCCGCACAGTCCCTGGAGTG CGGCAAGCAG CTGGACCAGCAGGCGCTCCAGCAGGAAC 
1098 PSARSSPHSPWSAASSWTSRR SSRN 

3366 AGC CTGGGCCGGGCCCCC AGCCTAAAGCGGAGGAGCCCGAGCGGGG AGCGGAGGTC CCTGCTGTCTGGAGAGGGC 
1123 SLGRAPSLKRRSPSGERRSLLSGEG 



Figure 1C 



3441 CAGGAGAGTCAGGATGAGGAGGAAAGTTCAGAAGAGGACCGGGCCAGCCCAGCAGGCAGTGACCATCGCCACAGO

1148 QESQDEEESSEEDRASPAGS DHRHR

3516 GGTTC CTTGG AACGlXjAGGCCAAG AGTTCCTTTG AC CTGCCTGACACTCTGCAGGTGCCGGG GCTGCACCGCACA 

1173 GSLEREAKSSFDLPDTLQVPGLHRT 

3591 GCCAGCGGCCGGAGCTCTGCCTCTGAGCACCAAGACTGTAATGGCAAGTCGGCTTCAGGGCGTTTGGCCCGCACC 

1198 ASGRSSASEHQDCNGKSASGRLART 

3666 CI'GAGGACTGATGACCCCCAACTGGATGGGGATGATGAC^TGATGAGGGAAATCTGAGCAAAGGGGAACGCATA 

1223 LRTDDPQLDGDDDNDEGNLSKGERI 

3741 CAAGK!CTGGGTCAGATCCCX3GCTTCCTGCCTGTTGCCGAGAGC^ 

1248 QAWVRSRLPACCRERDSWSAYIFPP 



3B16 CAGTCAAGGTTTCGTCTCCTGTGTCACCX3GATC& 

1273 QSRFRLLCHRI ITHKMFDHVVLVI I

************************************ 

3891 TTCCTCAACTGTATCACCATC^CTATGGAGCGCCCCAAAATTGACCCCCACAGCGCTGAG 
1298 FLNCIT IAMERPKIDPHSAERI FLT

T -~ ************ **********************^ i IS2 ******** ******************** 

3966 CTCTCCAACTACATCTTCACGGCAGTCTTTCTAGCTGAAATGACAGTGAAGGTGGTG^CACTGGGCTGGTGCTTT 
1323 LSNYIFTAVFLAEMTVKVVALGWCF 

******************************** **^J[J^Q%* + * ******************************* * 
4041 GGGGAGCAGGCCTACCTGCGCAGCAGCTGGAATGTGCTC^ 

1348 GEQAYLRSSWNVLDGLLVLISVIDI

********* ****************************** 
4116 CTGGTCTCCATGGTCTCCGACAGCGGCACCAAGATCCTTGGC ATGCTGAGGGTGCTGCGGCTG CTGCGGAC CCTG 
1373 LVSMVSDSGT KILGML RVLR LLRTL 

*************** IIIS4 ************* ******* ***

4191 CGTCCACTCAGGGTC^TCAGCCXtGGCCCAGGG 

1398 RPLRVISRAQGLKLVVETLMS SLKP

**************** *************-k****J^^jLS$* ************************** ******* 

4266 ATTGGCAACATTGTGGTC^TTTGCTGTGCCTTCTTCATCATTTTTGGAATTCTCGGGGTGCAGCTCTTCAAAGGG 
1423 IGNIVVICCAFFI I FGILGVQLFKG 



4341 AAGTTCTTCGTGTGTCAGGGTGAGGACACCAGGAACATCACTAACAAATCCGACTGCGCTGAGGCCAGCTACCGA 
1448 KFFVCQGEDTRNI TNKSDCAEASYR

***********************III P Loop*****************

4416 TGGGTCCGGCACAAGTACAACTTTGACAACCTGGGCCAG^ 

1473 WVRHKYNFDNLGQALMSLFVLASKD



********************* ***

'4491 GGTTGGGTTGACATCATGTATGATGGGCTGGATGCT^ 

1498 GWVD IMYDGLDAVGVDQQP IMNHNP



********************************* IIIS6 ***********************************

4 566 TGGATGCTGCTATACTTCATCTCCTTCCTCCTCATCGTGGCCTTCTTTGTCCTGAACATGTTTGTGGGCGTGCjTG 
1523 WMLLYFISFLLIVAFFVLNMFVGVV

************ 

4641 GTGGAGAACTTCC ATAAGTG CAGACAG CACC AGGAGG AGGAGGAGGCG AGGCGGCGTGAGGAG AAGCGACTACGG 
1548 VENFHK CRQHQEEEEARRREEKRLR

4716 AGGCTGGAGAAAAAGAGAAGGAGTAAGGAGAAGC AGATGGCCGAAGCCCAGTGCAAGCCCTACTACTCTGACTAC 
1573 RLEKKRRSKEKQ MAEAQCKPYYSDY 

************************ ivs 2.**** 

4 791 TCGAGATTCCGGCTCCTTGTCCACCACCTGTGTACCAGCCACTACCTGGACCTCTTCATCACTGGTGTCATCaGG 
1598 SRFRLLVHHLCTSHYLDLFITGVIG 

********************************* ****** 

p 4866 CTGAACGTGGTCACTATGQCCATGGAACATTACCAGCAGCCCCAGATCCTGGACGAGGCTCTGAAGATCTGCAAT 

1623 LNVVTMAMEHYQQPQI LDEALKICN

f£ *************************** HIS 2 *************** *********** 

y1 4941 TAC&TCTTTACCGTCATCTTTGTCTTTGAGTCAGT^^ 

1648 YIFTVI FVFESVFKLVAFAFRRFFQ

******* ********************IVS3 ************* **************

5016 GACAGGTGGAACCAGCTGGACCTGGCTATTGTGCTTCTGTCCATCATGGGCATCACACTGGAGGAGATTGAGGTC
1673 DRWNQLDLAIVLLSIMGITLEEIEV

******************** ****IVS4 ********* 

6 5091 AATCTGTCGCTGCCC^TCAACCCCACCATCA^ 
1698 NLSLPINPTIIRIMRVLRIARVLKL



*************************** ****************** 

5166 TTG AAG ATGG CTG TGGGCATGCGGG CACTGCTGCACACG GTG ATGC AGG C CCTG C C CC AGGTGGGG AAC CTGGG A 
1723 L KMAVGMRALLHTVMQALPQVGNLG 

************************* IVS5 ************************ ******** 
524 1 CTTCTCTTCATGTTATTGTTTTTCATCT 

1748 LLFMLLFFI FAALGVELFGDLE CDE 

***************** IV P Loop *******

5316 ACACACCCTTGTGAGGGCTTGGGTCGGCATGCCAC^ 

1773 THPCEG LGRHATFRNFGMAFLTLFR 

***************************** ******* 
5391 GTCTCCACTGGTGACAACTGGAATGGTATTATGAA 

1798 VSTGDNWNGIMKDPSRDCDQESTCY

*********************** ***********IVS6 ***************** ************

5466 AAC^CTGTCATCTCCCCTATCTACTTTGTGTCCTTCGTGCTGACGGCCCAGTTTGTGCT 

1823 NTVISPIYFV SFVLTAQFVLVNVVI

***** ****** **********

5541 GCTGTG CTG ATGAAGCACCTGGAAGAAAG CAACAAAGAGGCCAAGGAGGAGGCCG AGCTCGAGGCCG AGCTGG AG 
1848 AVLMKHLEESNKEAKE EAELEAEL E 



5616 CTGGAGATGAAGACGCTCAGCCCGCAGCCCCACTCCCCGCTGGGCAGCCCCTTCCTCTGGCCCGGGGTGGAGGGT

1873 LEMKTLSPQPHSPLGS PFLWPGVEG 

5691 GTCAACAGTACTGACAGCCCTAAGCCTGGGGCTCCACACACCACTGCCCACATTGGAGCAGCCTCGGGCTTCTCC 

1898 V NSTDSPKPGAPHTTAHI GAASGFS

5766 CTTGAGCACCCCACGATGGTACCCCACCCCGAGGAGGTGCC^^ 

1923 LEHPTMVPHPEEVPVPLGPDLLTVR 

5841 AAGTCTGGTGTCAGCCGGACGGACTCTCTGCCCAATGACAGCTACATGTGCCGCAATGGGAGCACTGCTGAGAGA 

1948 KSGVSRTHSLPNDSYMCRNGSTAER 

5916 TCCCTAGGACACAGGGGCTGGGGGCTCCCCAAAGCCCAGTCAGGCTCCATCTTGTCCGTTCACT 

1973 SLGHRGWGLPKAQS GS I LSVHSQPA 

5991 GACACCAGCTGCATCCTACAGCTTCCCAAAGATGTGCA 

1998 DTSCILQLPKDVHYLLQPHGAPTWG 

6066 GCCATCCCTAAACTACCCCCACCTGGCCGC TCCCCTCTGGCTCAGAGGCCTCTCAGGCGCCAGG CAGCAATAAGG

2023 AI PK LPPPGRS PLAQRPLRRQAAI R

6141 ACTGACTCCCTGGATGTGCAGGGCCTGGGTAGCCGGGAAGACCTGTTGTCAGAGGTGAGTGGGCCCTCCTGC CCT

2048 TDSLDVQGLGSREDLLSEVSGPSCP 

6216 CTGACCCGGTCCTCATCCTTCTGGGGCGGGTCGAGCATCCAGGTGCAGCAGCGTTCCGGCATCCAGAGCAAAGTC 

2073 LTRSSSFWGGSS I QVQQRSG IQSKV

6291 TCCAAGCACATCCGCCTGCCAGCCCCTTGCCCAGGCCTGGAACCCAGCTGGGCCAAGGACCCTCCAGAGACCAGA

2098 SKHIRLPAPCPGLEP S WAKDPPETR 

6366 AGCAGCTTAGAGCTGGACACGGAGCTGAGCTGGATTTCAGGA^^ 

2123 SSLELDTELSWISGDLLPSSQEEPL

6441 TTCCCACGGGACCTGAAGAAGTGCTACAGTGTAGAGACCCAGAGCTGCAGGCGCAGGCCTGGGTTCTGGCTAGAT 
2148 FPRDLKKCY SV ETQSCRRRPGFWLD

6516 GAACAG CGG AGACACTCC ATTGCTGT CAGC TGTCTGG AC AGCGG CTCCCAACCC C G CCTATGTCCAAGC CCC TCA 

2173 EQRRHSIAVSCLDSGSQPRLCPSPS 

6591 AGCCTCGGGGGCC&ACCTCTTGGGGGTCCTGGGAGCC 

2198 SLGGQPLGGPGSRPKKKLSPPS IS I

6666 GACCCCCCGGAGAGCCAGGGCTCTCGGCCCCCATGCAGTCCTGGTGTCTGCCTCAGGAGGAGGGCX5CCGGCCAGT 

2223 DPPESQGSRPPCSPGVCLRRRAPAS

6741 GACTCTAAGGATCCCTCGGTCTCCAGCCCCCTTGACAG(^CGGCTGCCTCACCCTCCC<^U^GAAAGACACGCTG 

2248 DSKDPSVSSPLDS TAASPSPKKDTL 

6816 AGTCTCTCTGGTTTGTCTTCTGACCCAACAGACATGGACCCCrG SEQ ID JTO:l 

2273 SLSGLSSDPTDMDP SEQ ID NO:1

1 ATGACCGAGGGCGCACGGGCCGCCGACGAGGTCCGGGTGCCCCTGGGGCGCCGCCCCTGGCCCTGCGGCGTTGGT
26MTEGARAADEVRVPLGRRPWPCGVG 

76 GGGGGCGTCCCCGGAGAGCCCCGGGGCGCCGGGACGCGAGGCGGAGGGGGGTTCGAGCTCGGCGTGTCACCCTCC
51GGVPGEPRGAGTRGGGG FELGVS PS 

151 GAGAGCCCGGCGGCCGAGCGCTGCGCGGAGCTGGGTGCCGACGAGGAGCAGCGCGTCCCGTACCCGGCCTTGGCG 

76ESPAAERCAELGADEEQRVPYPALA 

****** 

226 GCC^CGGTCTTCTTCTGCCTCGGTCAGACCACGCGGCCGCGCAGCTGGTCCGTCCGGCTGGTCTGCAACCCATGG 
101AT VFFCLGQTTRPRSWSVRLVCNPW 

************************ JS1 ************* ********************** 

301 TTCGAGCSkCGTGAGCATGCTGGTAATCATGCTCAACTGrc 

126 FEHVSMLVIMLNCVTLGMFRPCEDV 

*************** ********* I S2 *********** 

376 GAGTGCGGCTCCGAGCGCTGCAACATCCTGGAGGCCTTTGACGCCTTCATTTTCGCCTTTTTTGCGGTGGAGATG 
151 ECGSBRCNILEAFDAF I FA.FFAVEM 

*************************** ********************* 

451 GTC^TCAAGATGGTGGCCTTGGGGCTGTTCGGGCAGAAGTGTTACCTGGGTGACACGTGGAACAGGCTGGATTTC 
176 VIKMVALGLFGQKCYLGDTWNRLDF 

**************! 33 *************** ****** ****************** 

526 TTCATCGTCGTGGCGGGCATGATGGAGTACTCGTTGGACGGACACAACGTGAGCCTCTCGGCTATCAGGACCGTG 
201 FIVVAGMMEYSLDGHNVSLSAIRTV 

********** **is 4 ***** ********** ******************** 

601 OGGGTGCTGCGGCCCCTCCGCGCCATCMCCGCGTGCCTAGCATGCGGATCCTGGTCACTCTGCTGCTGGATACG 
226 RVLRPLRAINRVPSMR I LVTLLLDT 

****************************** I35*********************************** 
676 CTGCCCATGCTCGGGAACGTCCTTCTGCTGTGCTTCTTGGTCTTCTTCATTTTCGGCA 

251 LPMLGNVLLL CFFVFFI FGIVGVQL 
**** 

751 TGGGCTGGCCTCCTGCGGAACCGCTGCTTCCTGGACAGTGCCTTTGTCAGGAACAACAACCTGACCTTCCTGCGG 
276 WAGLLRNRCFLDSAFVRNNNLTFLR 



826 CCGTACTACCAGACGGAGGAGGGCGAGGAGAACCCGrrCATCTGCTCCTCACGCCGAGACAACGGCATGCAGAAG 
301 PYYQTEEGEENPF ICS SRRDNGMQK 



901 TGCTCGCACATCCCCGGCCGCCGCGACGTGCGCATGCCCT 

326 CSHIPGRRDVRMPCTLGWEAYTQPQ 



976 GCCGAGGGGGTGGGCGCTGCACGCAACGCCTGCATCAACTGGAACC^GTACTACAACGTGTGCCGCTCGGGTGAC 
351 AEGVGAARNAC INWNQYYNVCRSGD

**************************I P Loop**************

1051 TCCAACCCCC&CAACGGTGCCATCAA^ 
376 SNPHNGAINFDNTCYAWIAI FQVIT 

*************************** *************************** 
1126 CTGGAAGGCTGGGTGGACATCATGTACTACGTCATGGACGCCCACTCATTCTACAACTTCATCTATTTCATCCTG
401 LEGWVDIMYYVMDAHSFYNFIYFIL

********************* IS6 ************************ ***********

1201 CTCATCATCGTGGGCT^CTTCTTCATG^^ 
426 LIIVGSFFMINLCLVVIATQFSETK

1276 CAGCGGGAGAGTCAGCTGATGCGGGAGCAGCGGGCACGCCACCTGTCCAACGACAGCACGCTGGCCAGCTTCTCC
451 QRESQLMREQRARHLSNDSTLASFS



1351 GAGCCTGGCAGCTGCTACGAAGAGCTGCTGAAGTACGTGGGCCACATATTCCGCAAGGTCAAGCGGCAGCTTGCG
476 EPGSCYEELLKYVGHIFRKVKRQLA

1426 CCTCTACGCCCGCTGGCAGAGCCGTGGCGCAAGAAGGTGGACCGCAGTGCTGTGCAAGGCCAGGGTCCCGGGCAC
501 PLRPLAEPWRKKVDPSAVQGQGPGH

1501 CGCCAGCGCCGGGCAGQ CAGGCACACAGCCTCGGTGCACCACCTGGTCTACCACCACCATCACCACCACCACCAC

526 RQRRAQRHTASVHHLVYHHHHHHHH

1576 CACTACCATTTCAGCCATGGCAGCCCCCGC^GGCCCGGCCCCGAGCCAGraCGCCTGCGACACCAGGCTGGTCCGA 

551 HYHFSHGSPRRPGPEPGACDTRLVR

1651 GCTGGCGCGCCCCCCTCGCCACCTTCCCCAGGCCGCGGACCCCC(^ACGCAGAGTCTGTGCACAGCATCTACCAT 
576 AGAPPSPPSPGRGPPDAESVHSIYH 



1726 GCCGACTGCCACATAGAGGGGCCGCAGGAGAGGGCCCGGGTGGGCACATGCCGCAGCCACTGCCGCTGCCAGGCT 
601 ADCHIEGPQERARVGTCRSHCRCQP



1801 CAGGCTGGCCACAGGGCTGGGCACCATGAACTACCCCACGATCCTGCCCTCAGGGGTGGGCAGCGGCAAAGGCAG
626 QAGHRAGHHELPHDPALRGGQRQRQ



1876 CACCAGCCCCN3GACCCAAGGGGAAGTG<MCCGGTGGACC 
651 HQPRTQGEVGRWTARHRGHGPLSLN 



1951 AGCCCTGATCCCTACGAGAAGATCCCGCATGTGGCCGGGGAGC^^ 
676 SPDPYEKIPHVAGEHGLASPGHLSG

2026 CTCAGTGTGCCCTGCCCCCTGCCCAGCCCCCCAGCGGGCACACTGACCTGTGAGCTGAAGAGCTGCCCGTACTGC
701 LSVPCPLPSPPAGTLTCELKSCPYC



2101 ACCCGTGCCC1X5GAGGACCCGGAGGGTGAGCTCAGCGGCTCGGAAAGTGGAGACTCAGATGGCCGTGGCGTCTAT 
726 TRALEDPEGELSGSESGDSDGRGVY 



2176 GAATTCACGCAGGACGTCCGGCACGGTGACCGCTGGGACCCCACGCGACCACCCCGTGCGACGGACACACCAGGC
751 EFTQDVRHGDRWDPTRPPRATDTPG 



2251 CCAGGCCCAGGCAGCCCCCAGCGGCGGGCACAGCAGAGGGCAGCCCCGGGCGAGCCAGGCTGGATGGGCCGCCTC 

776 PGPGSPQRRAQQRAAPGEPGWMGRL

2326 TGGGTTACCTTCAGCGGCAAGCTGCGCCGCATCGTGGACAGCAA 

801 WVTPSGKLRRIVDSKYFSRGIMMAI

************************************ *** 

2401 CTTGTC^CACGCTGAGCATGGGCGTGGAGTACCATGAGCAGCCCGAGGAGCTGA 

826 LVNTLSMGVEYHEQPEELTNALEIS



************************ IIS2 **************************** ********

2476 AACATCGTGTTCACCAGCATGTTTGCCCTGGAGATGCTGCTGAAGCTGCTGCGCGCTGTCCCTCTGGGCTACATC
851 NIVFTSMFALEMLLKLLRAVPLGYI

************************ IIS3 ******** *****************

2551 CGGAACCCGTACAACATCTTCGACGGCATCATCGTGGTC& 
876 RNPYNIFDGIIVVISVWEIVGQADG

************************ IIS4 ******* ***************************

2626 GGCTTGTCTGTGCTGCGCACCTTCCX3GCTGCTGCGTGTGCTG 
901 GLSVLRTFRLLRVLKLVRFLPALRR

****** ******************** IIS5 ********************

2701 CAGCTCGTGGTGCTGGTGAAGACCATGGACAACGTGGCTACCTTC 
926 QLVVLVKTMDNVATFCTLLMLFIFI

************************************ 

2776 TTCAGCATCCTGGGCATGCACCTTTTCGGCTGCAAGTTCAGCCTGAAGACAGACACCGGAGACACCGTGCCTGAC
951 FSILGMHLFGCKFSLKTDTGDTVPD

**************************** II P Loop **************************

2851 AGGAAGAACTTCGACTCCCTGCTGTGGGCCATCG 
976 RKNFDSLLWAIVTVFQILTQEDWNV

********* ******************************* IIS6 *********

2926 GTCCTOTACAAOGGCATGCKTCTCCaCCTCCTCCTGC^CGCC 

1001 VLYNGMASTSSWAALYFVALMTFGN

************** ********** *********** ********** 

3001 TATGTGCTCTTCAACCTGCTGGTGGCCATC^ 

1026 YVLFNLLVAILVEGFQAEGDAMRSD 

3076 ACGGACGAGGACAAGACGTCGGTCCACTTCGAGGAGGACTTCCACAAGCTCAGAGAACTCCAGACCACAGAGCTG
1051 TDEDKTSVHFEEDFHKLRELQTTEL

3151 AAGATGTGTTCCCTGGCCGTGACCCCCAACGGCACCTGGAGGGACGAGGCAGCCTGTCCCCTCCCCTCATCATGT
1076 KMCSLAVTPNGTWRDEAACPLPSSC 

3226 GCACAGCTCCCACGCCCATGCCTACCCCCAAGAGCT^ 

1101 AQLPRPCLPPRAHHSWMQPPASQTL

3301 GGCGTGGCAGCAGCAGCTCCGGGGACCCGCCACTGGGAGACCAGAAGCCTCCGGCAGCCTCCGAAGTTCTCCCTG
1126 GVAAAAPGTRHWETRSLRQPPKFSL

3376 TGCCCCCTGGGGCCCAGTGGCGCCTGGAGCAGCCGGCGCTC^ 

1151 CPLGPSGAWSSRRSSWSSLGRAQPQ

3451 GCGCCGGCGTGCCAGTGTGGGGAACGTGAGTCCCTGCTGTCTGGCGAGGGCAAGGGCAGCACCGACGACGAAGCT
1176 APACQCGERESLLSGEGKGSTDDEA 

3526 GAGGACGGCAGGGCGCGCTCCGGGCCCCGTGCCACCCCACT

1201 EDGRARSGPRATPLRRAESLDPRPL 

3601 CGGCGGCCGCCTCCCGCCTACCAAGTGCGCGATCGCGACGGGCAGGTCGTGGCCCTGCCCAGCGACTTCTTCCTG 
1226 RRPPPAYQVRDRDGQVVALPSDFFL

3676 CGCATCGACAGCCACCGTGAGOATGGAGCCGAGCTTGACGACXJAC^ 

1251 RIDSHREDAAELDDDSEDSCCLRLH

3751 AAAGTGCTGGTGCCCTACAAGCCCCAGCGGTGCCGGAGCAGGAGGCCTGGGCCCTCTACCCTCTACCTCTTCTCC
1276 KVLVPYKPQRCRSRRPGPSTLYLFS 

****************** J J 

3826 CCACAGAACCGGTTCCGCGTCTCCTGCCAGAAGGTCATCAC^

1301 PQNRFRVSCQKVITHKMFDHVVLVF

****************************** ********* 

3901 ATCTTCCTCAACTGCGTCACCATCGCCCTC^ 

1326 IFLNCVTIALERPDIDPGSTERVPL

********************** ******** IIIS2 ********************** *******

3976 AGCGTCTCCAATTACATCTTCACGGCCATCTTCGTGGCGGAGATGATGGTGAAGGTGGTGGCCCTGGGGCTGCTG 
1351 SVSNYIFTAIFVAEMMVKVVALGLL 

************************ IIIS3 *****************

4051 TCCGGCGAGCACGCCTACCTGCAGAGCAGCTGGAACCTGCTGGATGGGCTGCTGGTGCTGGTGTCCCTGGTGGAC 
1376 SGEHAYLQSSWNLLDGLLVLVSLVD 
****************** It********



4126 ATTGTCGTGGCCATGGCCTCGGCTGGTGGCGCCAAGATCCTGGGTGTTCTGCGCGTGCTGCGTCTGCTGCGGACC
1401 IVVAMASAGGAKILGVLRVLRLLRT

********* IIIS4 *****************************

4201 CTGCGGCCTCTGAGGGTCATCAGCCGGCCCCGGCTCAAGCTGGTGGTGGAGACGCTGATATCATCACTCAGGCCC 
1426 LRPLRVISRPRLKLVVETLISSLRP

************************* ******** IIIS5 ********************** *************

4276 ATTGGGAACATCGTCCTCATCTGCTGCGCCTTCTTCATCATTTTTGGCATTTTGGGTG 

1451 IGNIVLICCAFFIIFGILGVQLFKG
***

4351 AAGTTCTACTACTGCGAGGGCCCCGACACCAGGAACATCTCCACCAAGGCACAGTGCCGGGCCGCCCACTACCGC 
1476 KFYYCEGPDTRNISTKAQCRAAHYR

***************** ******** III P Loop **** ***********

4426 TGGGTGCGACGCAAGTACAACTTCGACAACCTGGGCCAGGCCCTGATGTCGCTGTTCGTGCTGTCATCCAAGGAT
1501 WVRRKYNFDNLGQALMSLFVLSSKD



********************* ***

4501 GGATGGGTGAACATCATGTACGACGK^3CTGGATGCCGTGGGTGTCGACCAGCAGCCTGTGCAGAACCACAACCCC
1526 GWVNIMYDGLDAVGVDQQPVQNHNP

******************************** ******* IIIS6 **************** ***********

4576 TGGATGCTGCTGTACTTCATCTCCTTCCTCTGCT

1551 WMLLYFISFLCYIVSFFVLNMFVGV

***************

4651 GTGGTCGAGAACTTCCACAAGTGCCGGCCGCACCAGGAGGCGGAGGAGGCGCGGCGGCGAGAGGAGAAGCGGCTG
1576 VVENFHKCRPHQEAEEARRREEKRL

4726 CGGCGCCTAGAGAGGAGGCGCAGGAGCACTTTCCCCAGCCCAGAGGCCCAGCGCCGGCCCTACTATGCCGACTAC
1601 RRLERRRRSTFPSPEAQRRPYYADY

************************ IVS1 ***

4801 TCGCCCACGCGCCGCCGCTCC^rTC^CTaSCTGTGCAC 

1626 SPTRRRSIHSLCTSHYLDLFITFII

************************************ ***
4876 TGTGTCAACGTCATCACCATGTCCATGGAGCACTAT^ 

1651 CVNVITMSMEHYNQPKSLDEALKYC

************************ IVS2 ********************************** 

4951 AACTACOTCTTCACCATCGTGTTTGTCTTCGAG^ 

1676 NYVFTIVFVFEAALKLVAFGFRRFF 

************************ IVS3 ************ ******* ******* ****

5026 AAGGACAGGTGGAACCAGCTGGACCTGGCCATCGTGCTGCTGTCACTCATGGGCATCACGCTGGAGGAGATAGAG
1701 KDRWNQLDLAIVLLSLMGITLEEIE

************************ IVS4 ********** ***

5101 ATGAGCGCCGCGCTGCCCATCAACCCCACCMCM 

1726 MSAALPINPTIIRIMRVLRIARVLK



****************************** *************** 

5176 CTGCTGAAGATGGCTACGGGCATGCGCGCCCTGCTGGACACTGTGGTGCAAGCTCTCCCCCAGGTGGGGAACCTG 
1751 LLKMATGMRALLDTVVQALPQVGNL 

************************ IVS5 ********* **************************** 

5251 GGCCTTCTTTTC&TGCTCCTGTTTTTTATCTA 

1776 GLLFHLLFFIYLRLGVELFGRLECS

**************** IV P Loop *******

5326 GAAGACAACCCCTGCGAGGGCCTGAGCAGGCACGCCACCTTCAGCAACTTCGGCATGGCCTTCCTCACGCTGTTC
1801 EDNPCEGLSRHATFSNFGMAFLTLF

*************************************** 

5401 CGCGTGTCCACGGGGGACAACTGGAACGX3GATCATGAAGGACACGCTGCGCGAGTGCTCCCGTGAGGACAAGCA^ 
1826 RVSTGDNWNGIMKDTLRECSREDKH 

************************ IVS6 ******************** ********

5476 TGCCTGAGCTACCTGCCGGCCCCGTCGCCCGTCTAC^^

1851 CLSYLPAPSPVYFVTFVLVPQFVLV

*********************************

5551 AACGTGGTGGTGGCCGTGCTCATGAAGCACCTGGAGGAGAGCAACAAGGAGGCTCGGGAGGATGCGGAGCTG

1876 NVVVAVLMKHLEESNKEAREDAELD

5626 GCCGAGATCGAGCTGGAGATGGCGCAGGGCCCCGGGAGTGCACGCCGGGTGGACGCGGACAGGCCTCCCTTGCCC

1901 AEIELEMAQGPGSARRVDADRPPLP

5701 CAGGAGAGTCCGGCGCCAGGGACGCCCCAAACCTGGTTGCACGCAAGGTGTCCGTGTCCAGGATCTCT

1926 QESPAPGTPQTWLHARCPCPGSLAA

5776 CAACGACAGCTACATGTTCAGGCCCGTGGTGCCTGCCTCGGCGCCCCGGGCCCGCCCGCTGCAGGAGGTGGAGAT
1951 QRQLHVQARGACLGAPGPPAAGGGD

5851 GGAGACCTATGGGGCCGGCACCCCCTTGGAGTCCTGTGCCATCCCATCCAGATCCCATTGGCTGTGTCGAACCCA
1976 GDLWGRHPLGVLCHP IQ I PLAVSNP

5926 GCCAGGAGCGGCGAGCCCCTCCACGCCCTGTCCCCTCGGGGCAC^GGCGCTCCCCCAGTCTC^GCCGGCTGCTCT 
2001 ARSGEPLHALSPRGTAAPPVSAGCS 

6001 GCAGACAGGAGGCTGTGCACACCGATTCCTTGGAAGGGAAGATTCACAGCCCTAGGGACACCCTGGATCCTGCAG
2026 ADRRLCTPIPWKGRLTALGTPWILQ 

6076 AGCCTGGTGAGAAACCCCCGG SEQ ID NO: 3
2051 SLVRNPR SEQ ID NO: 3
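
The listing above pairs each row of 75 nucleotides with the 25 amino acids it encodes, so any clean row can be checked by translating it codon by codon. The following is a minimal sketch of such a check, assuming the standard genetic code and the one-letter amino acid abbreviations; the names `CODON_TABLE` and `translate` are illustrative and are not part of the specification. Codons that contain an unreadable character are reported as `X` rather than guessed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (third base varies fastest). '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a nucleotide row into one-letter amino acid codes.

    Spaces are ignored, a trailing partial codon is dropped, and any
    codon containing a character other than A, C, G, or T becomes 'X'.
    """
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


# First nucleotide row of the listing (positions 1 to 75).
row_1 = ("ATGACCGAGGGCGCACGGGCCGCCGACGAGGTCCGGGTGCCC"
         "CTGGGGCGCCGCCCCTGGCCCTGCGGCGTTGGT")

print(translate(row_1))  # MTEGARAADEVRVPLGRRPWPCGVG
```

The printed string matches the amino acid row shown beneath the first nucleotide row, which is the kind of agreement one would expect for every undamaged row of the listing.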
RLFRVMRLIKLLSRAEGV SEQ ID NO: 8

RLFRVMRLVKLLSRGEGI SEQ ID NO: 9

RLFRVMRLVKLLSRGEGI SEQ ID NO: 9

RLFRAARLIKLLRQGYTI SEQ ID NO: 10

RLFRAARLIKLLRQGYTI SEQ ID NO: 10

KLFRAARLIKLLRQGYTI SEQ ID NO: 10